# temporalized__49aabd4f.pdf

Temporalized EL Ontologies for Accessing Temporal Data: Complexity of Atomic Queries

V ıctor Guti errez-Basulto1 and Jean Christoph Jung1 and Roman Kontchakov2

1 Department of Computer Science, Universit at Bremen, Germany 2 Department of Computer Science and Information Systems, Birkbeck, University of London, UK

We study access to temporal data with TEL, a temporal extension of the tractable description logic EL. Our aim is to establish a clear computational complexity landscape for the atomic query answering problem, in terms of both data and combined complexity. Atomic queries in full TEL turn out to be undecidable even in data complexity. Motivated by the negative result, we identify well-behaved yet expressive fragments of TEL. Our main contributions are a semantic and sufﬁcient syntactic conditions for decidability and three orthogonal tractable fragments, which are based on restricted use of rigid roles, temporal operators, and novel acyclicity conditions on the ontologies.

1 Introduction

In recent years, the use of ontologies to enrich plain data with a semantic layer has become one of the outstanding applications of description logic (DLs) technologies in the Semantic Web. The ontology-based data access (OBDA) setting provides information systems with various advantages, e.g., a friendlier vocabulary for accessing heterogenous data is given by the ontology, and means of querying potentially incomplete data are provided by taking account of the implicit knowledge derived from the data and the ontology. Due to the increasing need to account for the temporal dimension of data available on the Web [Roth and Tan, 2013; Dong and Tan, 2015], the DL community has recently investigated extensions of the OBDA paradigm for temporal data. The initial efforts concentrated on temporal query languages with atemporal ontologies [Guti errez-Basulto and Klarman, 2012; Klarman and Meyer, 2014; Baader, Borgwardt, and Lippmann, 2015; Borgwardt, Lippmann, and Thost, 2015; Borgwardt and Thost, 2015]. On the other hand, temporal ontology languages can enhance conceptual modelling with temporal aspects [Artale et al., 2015], which are required, e.g., in applications managing data from sensor networks. In this line, the research has focused on temporal extensions of DL-Lite that support rewritability of temporal queries into the monadic second-order logic with order or into twosorted ﬁrst-order logic with < and + [Artale et al., 2013b;

2015]. Since standard relational database management systems have such built-in predicates, they can in principle evaluate the FO(<, +)-rewritings. However, no temporal extensions of other classical DLs have been investigated yet in the context of OBDA, which is partly because of the intractability and often even undecidability of the standard reasoning tasks (e.g., subsumption) [Artale et al., 2007; Guti errez-Basulto, Jung, and Lutz, 2012; Guti errez-Basulto, Jung, and Schneider, 2014]. On the other hand, temporal data has also been studied in classical database theory [Chomicki and Toman, 2005]. In their seminal paper, Chomicki and Imielinski [1988] identiﬁed DATALOG1S as a decidable extension of DATALOG with one successor function. Here we make the ﬁrst (to the best of our knowledge) attempt to link temporal OBDA with temporal deductive databases [Chomicki, 1990; Baudinet, Chomicki, and Wolper, 1993].

In this paper, we study TEL, a temporal extension of EL [Baader, Brandt, and Lutz, 2005]. The underlying DL component, EL, underpins the OWL 2 EL proﬁle of OWL 2 and the medical ontology SNOMED CT, which provides the vocabulary for electronic health records (EHRs). Indeed, applications managing EHRs must be able to provide information, e.g., on when and for how long some drug has been prescribed to a patient, so that drugs that interact adversely are not prescribed at the same time. Clinical trials [Shankar et al., 2008; O Connor et al., 2009] also require a uniﬁed conceptual model for specifying temporal constraints of protocol entities such as a viable participant should have had a vaccination with live virus 5 days ago or blood tests of a patient should be run every 3 days . These statements can be encoded in TEL:

Patient u 5

P9vaccinated.Live Virus v Viable Particip, (1)

Patient u 3

PReq Blood Test v Req Blood Test. (2)

Our main objective is to establish the limits of decidability and tractability of the query answering problem over TEL ontologies, in terms of both data and combined complexity. In order to set the foundations, we focus on temporal atomic queries. On the one hand, an atomic query Viable Particip(x, t) with the temporal concept inclusion (1) effectively encodes a tree-shaped temporal conjunctive query. On the other hand, using (1) to extend the vocabulary with a concept Viable Particip is closer to the spirit of the OBDA paradigm than repeating the same conjunction in all similar user queries. Moreover, a recurrent pattern Req Blood Test is expressible as an atomic query

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16)

Req Blood Test(x, t) with the temporal concept inclusion (2) but not expressible as a query without temporal concept inclusions like (2). As we shall see, even for atomic queries rather surprising (and challenging) results are obtained.

Our main contributions are complexity bounds, algorithms and rewritability into DATALOG1S for atomic query answering in fragments of TEL. Since query answering over unrestricted TEL turns out to be undecidable (in data complexity), we investigate its fragments to attain decidability and tractability. First, for TEL , which allows only the next F and previoustime P operators, we identify ultimate periodicity as a natural semantic condition ensuring decidability, more precisely, PSPACE data complexity (the question of decidability of the full TEL is left open for future work). Then, we identify a number of fragments with better computational properties. (i) For the fragment of TEL without rigid (not changing over time) roles on the right-hand side of concept inclusions, we construct a polynomial rewriting into DATALOG1S, and so, establish PSPACE-completeness for data complexity. This fragment contains all EL ontologies as well as both (1) and (2). (ii) Over temporally acyclic TEL -ontologies (with rigid roles and concepts), query answering is PTIME-complete in both data and combined complexity. This tractable fragment contains (1) and fully captures all atemporal EL ontologies and may prove particularly useful in applications; it, however, does not contain (2). (iii) Query answering over DL-acyclic TEL ontologies is NC1-complete for data complexity (in principle, highly parallelizable). This fragment contains many acyclic EL ontologies as well as both (1) and (2) (note that large parts of SNOMED CT are in fact acyclic). We remark that our two novel acyclicity conditions (each constraining only one dimension) are inspired by the traditional notion of acyclicity in (temporal extensions of) DLs [Haase and Lutz, 2008; Guti errez-Basulto, Jung, and Schneider, 2015]. Finally, (iv) we show that the language with only 3P and 3F (sometime in the past/future) on the left-hand side of concept inclusions also enjoys PTIME query answering.

Omitted proofs can be found at tinyurl.com/Temp EL16.

2 Preliminaries

We begin by introducing TEL, a temporal extension of the classical DL EL. Let NC, NR, NI be countably inﬁnite sets of concept, role and individual names, respectively. We assume that NR is partitioned into two inﬁnite sets, Nrig

R , of rigid and local role names, respectively. TEL concepts are deﬁned by the following grammar:

C, D ::= A | C u D | 9r.C | C | 3 C,

where A 2 NC, r 2 NR, and 2 {F, P}. A TEL-TBox (ontology) T is a ﬁnite set of concept inclusions (CIs) C v D and concept deﬁnitions (CDs) C D for TEL concepts C, D. Data is given in terms of temporal ABoxes A, which are ﬁnite sets of assertions of the form A(a, n) and r(a, b, n), where A 2 NC, r 2 NR, a, b 2 NI, and n 2 Z. We denote by ind(A) the sets of individual names occurring in A, and by tem(A) the set {n 2 Z | min A n max A}, where min A and max A are, respectively, the minimal and maximal time points in A. The size, |T | and |A|, of T and A is the number of

symbols required to write T and A, respectively, with time points n 2 Z encoded in unary. A temporal knowledge base (KB) K is a pair (T , A).

An interpretation I is a structure ( I, (In)n2Z), where each In is a classical DL interpretation with domain I: we have AIn I and r In I I. Rigid roles r 2 Nrig

R do not change their interpretation in time: r In = r I0 for all n 2 Z. We usually write AI,n and r I,n instead of AIn and r In, respectively, and the mapping I,n is extended to complex TEL-concepts as follows:

(C u D)I,n = CI,n \ DI,n,

(9r.C)I,n =

d | there is e 2 CI,n with (d, e) 2 r I,n

( C)I,n = CI,n op 1,

d | d 2 CI,n op k for some k > 0

, where op stands for if = P and for + if = F. We use strict 3 (k > 0) but our results do not depend on the choice.

An interpretation I is said to be a model of C v D, written I |= C v D, if CI,n DI,n, for all n 2 Z; and a model of C D if CI,n = DI,n, for all n 2 Z. We call I a model of a TBox T , written I |= T , if I |= for all 2 T . Note that TBoxes are interpreted globally in the sense that all CIs and CDs must be satisﬁed at every time point. A concept D subsumes a concept C with respect to T , written T |= C v D, if I |= C v D for all models I of T .

For ABoxes A we adopt the standard name assumption: a I,n = a for all a 2 ind(A), n 2 Z (and thus ind(A) I). The relation |= is extended to ABoxes by taking I |= A(a, n) iff a 2 AI,n and I |= r(a, b, n) iff (a, b) 2 r I,n; I is a model of A iff I |= for all 2 A. An interpretation I is a model of a KB (T , A), written I |= (T , A), iff I |= T and I |= A. Finally, K |= A(a, n) if I |= A(a, n) for every model I of K.

As the query language, we consider temporal atomic queries (TAQs) of the form A(x, t) with A 2 NC, x an individual variable and t a temporal variable. Given K = (T , A), a certain answer to A(x, t) over K is a pair (a, n) 2 ind(A) tem(A) with K |= A(a, n). We study the complexity of the query answering problem over temporal knowledge bases:

TAQ answering Input: TBox T , ABox A, TAQ A(x, t), a pair (a, n). Question: Is (a, n) a certain answer to A(x, t) over (T , A)?

Our results concern both the combined and data complexity of the problem: for data complexity, the TBox is ﬁxed. As usual, for a complexity class C and a class X of TBoxes, we say that TAQ answering over X is C-hard in data complexity if there

is some T 2 X such that answering TAQs over T is C-hard. Conversely, TAQ answering over X is in C in data complexity if, for all T 2 X, answering TAQs over T is in C.

As classes X, we will in particular look at full TEL and its fragments TEL3 and TEL in which, respectively, only the temporal operators 3 and are allowed. Note that 3 on the left-hand side and 2 (with the usual semantics) on the right-hand side of CIs can be expressed in TEL , e.g., instead of 3PA v X or, equivalently, A v 2FX, take A v A0 and PA0 v A0 u X, for a fresh A0. Thus, rigid concepts, which do not change their interpretation in time, can be expressed in these two fragments using 3P3F on the left-hand side of CIs.

3 Query Answering in TEL: Undecidability We ﬁrst pinpoint different sources of complexity for the query answering problem in TEL in order to identify computationally well-behaved fragments in the sequel.

We begin by showing that TAQ answering over TEL3 is undecidable. The known undecidability of subsumption in TEL3 [Artale et al., 2007] translates only into the combined complexity of TAQ answering. We strengthen the result to obtain undecidability in data complexity by reducing the halting problem for the universal Turing machine. We exploit the crucial observation that disjunction, although not in the syntax, can be simulated with 3 [Artale et al., 2007]. Theorem 1 TAQ answering over TEL3 is undecidable in data complexity. The proof can also be adapted to the non-strict semantics of 3 using the chessboard technique [Gabbay et al., 2003].

TEL , unlike TEL3, is not capable of expressing disjunction. Still, its expressiveness makes answering TAQs hard: Theorem 2 TAQ answering over TEL is non-elementary in combined complexity and PSPACE-hard in data complexity. The proof of PSPACE-hardness is close in spirit to that for

DATALOG1S [Chomicki and Imielinski, 1988]; note that the lower bound holds even in the restriction of TEL without 9r.C on the right-hand side of CIs. For the non-elementary lower bound, we take inspiration in the construction for the product modal logic LTL K [Gabbay et al., 2003, Theorem 6.34]. Our proof requires a careful implementation of the yardstick technique [Stockmeyer, 1974] with only Horn formulas.

Decidability of TAQ answering in full TEL is left open as interesting and challenging future work; more insights on the difﬁculty of the problem are given in Sec. 4. Nevertheless, we show that extending TEL with certain DL constructs that are harmless for data complexity of atemporal query answering [Krisnadhi and Lutz, 2007] immediately leads to undecidability. Let TELI and TELF be the extensions of TEL with inverse roles r and functionality axioms func(r), respectively.1 For both languages, we reduce the halting problem for the universal Turing machine to prove: Theorem 3 TAQ answering over TELI and TELF is undecidable in data complexity. In the rest of the paper, we study decidability and complexity of TAQ answering in various fragments of TEL and TEL3.

4 Foundations of Query Answering in TEL

In this section, we lay the groundwork for the development of algorithms for query answering in fragments of TEL by introducing canonical quasimodels, which are succinct abstract representations of the universal model of the KB, see also [Artale et al., 2013b; 2015]. They can also be viewed as a generalization of the canonical structures used for query answering in pure EL [Lutz, Toman, and Wolter, 2009].

In the sequel, we assume that TEL -TBoxes are in normal form, that is, they consist of CIs of the form

A u A0 v B, A v 9r.B, X v A,

1with the usual semantics: (r )I,n = {(e, d) | (d, e) 2 r I,n}; I |= func(r) iff e1 = e2, for all (d, e1), (d, e2) 2 r I,n and n 2 Z.

where A, A0 and B are concept names and X is a basic concept of the form A, A, or 9r.A, for a concept name A. Observe that, without loss of generality, is restricted to the left-hand side of CIs: for instance, A v FB is equivalent to PA v B. It is routine to show that every TEL -TBox can be transformed into the normal form by introducing fresh concept names; see, e.g., [Baader, Brandt, and Lutz, 2005].

Fix now a KB (T , A) with a TEL -TBox T in normal form and let CN be the set of concept names in (T , A). A map : Z ! 2CN is a trace for T if it satisﬁes the following: (t1) if A u A0 v B 2 T and A, A0 2 (n), then B 2 (n); (t2) if A v B 2 T and A 2 (n), then B 2 (n op 1). Traces are the building blocks of quasimodels: they represent the temporal evolution of individual domain elements. For example, for T = { PC v B, PB v C}, the map such that (i) is {B} for odd i and {C} for even i is a trace for T .

In order to describe interactions of domain elements, we require more notation. Let be a trace for T . For a rigid role r 2 Nrig

R , the r-projection of is a map projr( ): Z ! 2CN that sends each i 2 Z to {A | 9r.B v A 2 T , B 2 (i)}; for a local role r 2 Nloc

R , projr( ) is deﬁned in the same way on 0 but is ; for all other i 2 Z. Given a map %: Z ! 2CN and n 2 Z, we say that contains the n-shift of % and write % n if %(i n) (i), for all i 2 Z. For example, let T = {9r.B v B0} with rigid role r. In the picture below, the trace a contains the 1-shift of the r-projection of B:

0 1 -1 2 3 4

If r is local then a has to contain B0 only at 1 (but not at 3, etc.). We are now fully equipped to deﬁne quasimodels.

Let D = ind(A) [ CN henceforth. A quasimodel Q for (T , A) is a set of traces d, d 2 D, for T such that (q1) A 2 a(n), for all A(a, n) 2 A; (q2) B 2 B(0), for all B 2 CN; (q3) projr( b) 0 a, for all r(a, b, n) 2 A; (q4) if A 2 d(n) then projr( B) n d, for all d 2 D,

n 2 Z and A v 9r.B in T . Intuitively, quasimodels represent models of (T , A): each a stands for the ABox individual a; each B, on the other hand, represents all individuals that witness B for CIs A v 9r.B in T . The latter is, in fact, the crucial abstraction underlying quasimodels. Note that traces B are normalized: B occurs at time point 0, which is compensated by the shift operation in (q4). For example, in the picture above, if A v 9r.B 2 T then, in any model, a has an r-successor that belongs to B at moment 1. Such a successor can be obtained as a copy of trace B shifted by 1 so that its origin, 0, matches moment 1 for a. Then, by (q4), a belongs to B0 at all odd moments.

For the purposes of query answering we need to identify canonical (minimal) quasimodels. We deﬁne the canonical quasimodel as the limit of the following saturation (chase-like) procedure. Start with initially empty maps d, for d 2 D, and apply (t1) (t2), (q1) (q4) as rules: (q3), for example, says if

r(a, b, n) 2 A and A 2 projr( b)(i), then add A to a(i). Then we have the following characterization:

Theorem 4 Let Q = { d | d 2 D} be the canonical quasimodel of (T , A) with T a TEL -TBox. Then, for any A 2 CN,

(T , A) |= A(a, i) iff A 2 a(i), for a 2 ind(A), i 2 Z.

The procedure for constructing the canonical quasimodel deals with inﬁnite data structures (traces) and is generally not terminating. So, although Theorem 4 provides a criterion for certain answers, it does not immediately yield a decision algorithm for full TEL . We remark that known techniques for dealing with such inﬁnite structures cannot be easily applied: for example, MSO (over Z), a standard tool for decidability proofs in temporal DLs [Gabbay et al., 2003], is not sufﬁcient to encode the canonical quasimodel directly because (q4) requires +. In fact, the key to showing decidability for (fragments of) TEL is ﬁnding a ﬁnite representation of traces.

The starting point of the rest of the paper is a semantic condition on the canonical quasimodel, ultimate periodicity, which ensures decidability in data complexity. Let T be a TEL -TBox and Q the canonical quasimodel for (T , ;). We say that T is ultimately periodic, if there is p 2 N such that all B, B 2 CN, in Q are ultimately p-periodic, that is, for each B 2 CN, there are positive integers m P, p P, m F, p F p satisfying the following conditions:

B(n p P) = B(n), for all n m P, B(n + p F) = B(n), for all n m F.

Intuitively, an ultimately p-periodic trace has repeating sections on the left and on the right:

m F +p F m P p P

m F +2p F m P 2p P The condition of ultimate periodicity is rather natural. On the practical side, it is motivated by applications with recurrent patterns such as health care support [Shankar et al., 2008], see CIs (1) and (2) in Section 1. From the theoretical point of view, any satisﬁable LTL formula has an ultimately periodic model [Manna and Wolper, 1984].

We next show that ultimate periodicity is indeed sufﬁcient for decidability in data complexity.

Theorem 5 TAQ answering over ultimately periodic TEL - TBoxes is PSPACE-complete in data complexity.

PSPACE-hardness follows from (the proof of) Theorem 2. We prove the matching upper bound by rewriting an ultimately periodic TEL -TBox T into DATALOG1S [Chomicki and Imielinski, 1988]. First, we take temporal rules

r(x, y, t 1) r(x, y, t), for r 2 Nrig

R in T , (3)

B(x, t) A(x, t), A0(x, t), for A u A0 v B in T , (4) B(x, t) r(x, y, t), A(y, t), for 9r.A v B in T , (5)

which reﬂect rigidity of roles and standard EL concept inclusions on ABox individuals. Second, we observe that, for any trace d in the canonical quasimodel Q of any (T , A), if A 2 d(n) and A v 9r.B 2 T then, by (q4), d contains

the n-shift not only of projr( B) but also of A. Since T is ultimately periodic, for each trace B, we ﬁx integers m P, p P, m F, p F and take the following rules with a fresh predicate FB:

A(x, t + i) B(x, t), for 0 i < m F, A 2 B(i), A(x, t + i) FB(x, t), for 0 i < p F, A 2 B(m F + i), FB(x, t + m F) B(x, t), FB(x, t + p F) FB(x, t),

and symmetric rules with m P, p P and fresh PB. Intuitively, the rules in the ﬁrst line replicate the (irregular) part of B from 0 to m F. The two rules in the last line add recurring markers FB at the start of each period while the rules in the second line replicate the period of B starting from each marker FB.

The required DATALOG1S-program T contains all the rules above (note that CIs of the form A v B are also covered by the rules for traces B). Using the canonical quasimodel and Theorem 4, it is readily seen that T is equivalent to T : for every temporal ABox A, the answers to T over A coincide with the certain answers to (T , A). Theorem 5 follows from PSPACE data complexity in DATALOG1S [Chomicki and Imielinski, 1988] and independence of T from A.

Observe that Theorem 5 does not imply decidability of full TEL since it is open whether every TEL -TBox is ultimately periodic. We thus turn our attention to sufﬁcient syntactic conditions for ultimate periodicity and obtain tight complexity bounds for both data and combined complexity for the resulting fragments. We consider two types of conditions: restricted use of rigid roles and acyclicity of concept inclusions.

5 Restricted Use of Rigid Roles

We ﬁrst consider TEL

loc, the restriction of TEL in which only local roles are allowed. Due to the reduced interaction between the temporal and DL component, we obtain data tractability. Theorem 6 TAQ answering over TEL

loc is PSPACE-complete in combined and PTIME-complete in data complexity. PSPACE-hardness follows from the proof of PSPACE-hardness for entailment in Horn-LTL [Chen and Lin, 1993] and PTIMEhardness from atomic query answering in EL. For the upper bounds, let (T , A) be a KB with T a TEL

loc-TBox and Q its canonical quasimodel. We take a propositional variable PA,d for each A 2 CN and d 2 D and construct a Horn LTL formula 'T ,A whose minimal model is isomorphic to Q: variable PA,d is true in the model at moment n iff A 2 d(n). We take the conjunction of the following formulas, for d 2 D:

2(PA,d PA0,d ! PB,d), for A u A0 v B 2 T , 2( PA,d ! PB,d), for A v B 2 T ,

n PA,a, for A(a, n) 2 A, PB,B, for B 2 CN,

n PB,b ! n PA,a, for r(a, b, n)2A, 9r.B v A2T ,

PB0,B !2(PA,d !PA0,d), for A v 9r.B, 9r.B0 v A0 2T ,

where n is n

F if n 0 and n

P if n < 0 and 2 is the globally operator. It is readily veriﬁed that 'T ,A is as re-

quired. Crucially, (q4) for local roles boils down to the last formula above. Since entailment in LTL is in PSPACE [Sistla and Clarke, 1985] and 'T ,A is polynomial in the size of (T , A), we obtain membership in PSPACE for combined complexity.

To obtain the PTIME data complexity, observe that traces B, B 2 CN, are ultimately 2|T |-periodic because they are traces of the canonical quasimodel for (T , ;); so, they can be maintained in constant space. Next, traces a, a 2 ind(A), are ultimately 2|T |+|A|-periodic, but a closer inspection reveals that the middle irregular section, m P + m F, is bounded by |A| + 2|T |, while both periods, p P and p F, by 2|T |; see, e.g., Lemma 3 [Artale et al., 2013a]. Thus, Q can be stored in space bounded by a polynomial in |A|. Since each rule application extends the traces, the saturation procedure for constructing Q terminates in polynomial time in the size of A.

Since ontologies without rigid roles at all may be too restrictive for applications, we consider TEL

l-rig-TBoxes where rigid roles are allowed only in CIs of the form 9r.B v A. Theorem 7 TAQ answering over TEL

l-rigis PSPACE-complete in data complexity and in EXPTIME in combined complexity. PSPACE-hardness in data complexity follows from the proof of Theorem 2. For the upper bounds, we construct rewritings into DATALOG1S, similarly to T in Section 4 (Theorem 5).

6 Acyclicity Conditions It is known that acyclicity conditions may lead to better complexity. In particular, acyclic TBoxes are a way of obtaining CTL-based temporal extensions of EL that have rigid roles and enjoy PTIME subsumption [Guti errez-Basulto, Jung, and Schneider, 2015]. In DATALOG1S, a restriction on recursion has also been used to attain tractability [Chomicki, 1990]. From the application point of view, large parts of SNOMED CT and GO [Gene Ontology Cons., 2000] are indeed acyclic. So, we believe that the fragments we consider below are wellsuited for temporal extensions of such ontologies.

Acyclic TBoxes are ﬁnite sets of CDs A C, A 2 NC, such that no two CDs have the same left-hand side, and there are no CDs A1 C1, . . . , Ak Ck in T such that Ai+1 occurs in Ci, for all 1 i k (where Ak+1 := A1). We say A is deﬁned in T if A C 2 T and primitive otherwise. Theorem 8 TAQ answering over acyclic TEL is in LOGTIME-uniform AC0 in data complexity and in PTIME in combined complexity. The LOGTIME-uniform AC0 upper bound is established by rewriting into FO(+): for a given TAQ A(x, t) and TBox T , we construct a two-sorted ﬁrst-order formula 'T ,A(x, t) with functions +1 and 1 on temporal terms such that (T , A) |= A(a, i) iff A (viewed as an interpretation) is a

model of 'T ,A(a, i), for all ABoxes A, a 2 ind(A), i 2 Z. We adapt the technique developed for atemporal EL [Bienvenu, Lutz, and Wolter, 2012]:

'T ,A(x, t) = SA(x, t), if A is primitive, 'T ,A(x, t) = SA(x, t) _ 'T ,C(x, t), if A C 2T , 'T ,B1u B2(x, t) = 'T ,B1(x, t) 'T ,B2(x, t),

'T ,9r.B(x, t) = 9y

Rr(x, y, t) 'T ,B(y, t)

, 'T , B(x, t) = 'T ,B(x, t op 1),

where SA(x, t) is a disjunction of all B(x, t) for concept names B with T |= B v A, and Rr(x, y, t) is r(x, y, t) for r 2 Nloc

R and 9t0 r(x, y, t0) for r 2 Nrig

Note that 'T ,A is an FOZ-rewriting in the terminology of Artale et al. [2013b; 2015] because the temporal terms range over Z. However, the inﬁnite interpretation of A is empty after at most |T | steps from the ABox and so, 'T ,A can be converted into an FO-rewriting whose temporal terms range over tem(A) only; see [Artale et al., 2015].

We next introduce novel notions of acyclicity that restrict only one dimension, DL or temporal.

DL Acyclicity First, we introduce DL-acyclic TEL -TBoxes, which are wellsuited as temporal extensions of, say, biomedical ontologies that may require recurrent patterns but have an acyclic DL component. A TEL -TBox T with concept names CN is called DL-acyclic if there is a mapping DL : CN ! N such that:

(i) Av9r.B or 9r.B v A2 T implies DL(A) > DL(B); (ii) A v B implies DL(A) = DL(B); (iii) A u A0 v B 2 T implies DL(A) = DL(A0) = DL(B). A DL-acyclic TBox is of depth k if k is minimal such that a witnessing mapping DL satisﬁes DL(B) k for all B 2 CN.

Theorem 9 TAQ answering over DL-acyclic TEL -TBoxes of depth k 1 is k-EXPSPACE-complete in combined complexity and NC1-complete in data complexity.

A closer inspection of the non-elementary lower bound proof in Theorem 2 reveals that the TBox used is DL-acyclic and TAQ answering over TBoxes of depth k is k-EXPSPACE-hard. NC1-hardness in data complexity follows [Artale et al., 2015] by reduction of the word problem of NFAs to TAQ answering, even without the DL dimension.

For the matching upper bounds, ﬁx (T , A) with T of depth k. We devise a completion procedure, which is based on special LTL-formulas and implies ultimate periodicity of all traces in the canonical quasimodel of (T , A); cf. Section 5. Given any A, let the slice Ai consist of all A(a, i) 2 A, all r(a, b, i) 2 A and all r(a, b, i) with r(a, b, j) 2 A, for some j, and r 2 Nrig

R . The algorithm separates consequences of the role structure of A and local temporal consequences of T . In particular, it exhaustively extends A by all A(a, i) with either

(T , Ai)|=A(a, i) or B(a, i op 1)2A, B v A2T . (6)

It turns out that Ai in (6) can be replaced by its suitably deﬁned quotient Bi. Intuitively, the logic can only distinguish distinct trees of depth k, whose number depends on |T | only; so, the size of Bi is independent of |A|. By induction on depth k, we deﬁne LTL-formulas 'a,i of k-fold-exponential size characterizing all A 2 CN with (T , Bi) |= A(a, i): we begin from formulas as in Theorem 6; the induction step takes account of the structure of Bi and incurs an exponential blowup.

Now, for combined complexity, observe that each of the polynomially many 'a,i can be analyzed in k-EXPSPACE. For data complexity, observe that checking (T , Bi) |= A(a, i) can be done in constant time. The second option in (6), however, cannot be implemented directly as the number of steps depends on |A|. Instead, we construct a B uchi automaton that accepts precisely the traces for T and cast the second option in (6) as the question of whether all traces extending A have A at position i, which is a regular property and so, is in NC1.

Temporal Acyclicity We next relax acyclicity by admitting recursion in the DL dimension (but not in temporal); thus, temporally acyclic TBoxes include general EL-TBoxes. A TEL -TBox T with concept names CN is temporally acyclic if there is : CN ! N such that:

(i) PAv B or FB v A2T implies (B)= (A)+1; (ii) 9r.B v A or A v 9r.B 2 T implies (A) = (B); (iii) A u A0 v B 2 T implies (A) = (A0) = (B). Temporally acyclic TBoxes cannot, unlike DL acyclic ones, express rigid concepts. Still, we can partition concept names NC into local Nloc

C and rigid Nrig

C and obtain the following:

Theorem 10 TAQ answering over temporally acyclic TEL (with rigid concepts) is PTIME-complete in data and combined complexity. The lower bounds are from EL. For the upper bounds, we show a small quasimodel property: traces of the canonical quasimodel of any (T , A) with such a TBox T satisfy

a(j) = a(j0), if j, j0 > u + |T | or j, j0 < l |T |,

B(j) = B(j0), if j, j0 > |T | or j, j0 < |T |,

where u = max A and l = min A. Intuitively, the canonical quasimodel has a restricted temporal extension that stretches only |T | time points beyond A. By the small quasimodel property, the procedure for constructing the canonical quasimodel can be implemented in polynomial time: traces d require only polynomial space, and rules (q1) (q4) extend the traces.

Inﬂationary TEL3

Next, we follow an approach suggested by Artale et al. [2013b] (in the context of temporal DL-Lite) and restrict TEL3 by allowing 3 only on the left-hand side of CIs. This fragment is denoted by TEL3

inﬂ, for inﬂationary TEL (which is related to inﬂationary DATALOG1S [Chomicki, 1990]). Note that TEL3

inﬂ extends general EL-TBoxes. Yet, the complexity is the same: Theorem 11 TAQ answering over TEL3

inﬂis PTIME-complete in both data and combined complexity. We need to show only the upper bounds. Observe that TEL3

inﬂ can still be viewed as a fragment of TEL ; see Section 2. In fact, one can show an analogue of Theorem 4 with the following replacement of (t2): (t20) if 3 A v B 2 T and A 2 d(n), then B 2 d(n0) for

all n0 > n if = P and for all n0 < n if = F. We establish a special shape of the traces in the canonical model of any (T , A). Let %: Z ! 2CN be a map and let l, u 2 Z with l u. We say that % is an [l, u]-bow tie if

for all i > u, we have %(i + 1) %(i), and if %(i + 1) =

%(i) then all %(i0), for n0 i, coincide; symmetrically, for all i < l, we have %(i 1) %(i),

and if %(i 1) = %(i) then all %(i0), for i0 i, coincide. These properties mean that % grows monotonically to the right of u and to the left of l; in other words, % has inﬂationary behaviour. We prove that the traces d in the canonical quasimodel Q of (T , A), for any A, enjoy the following properties:

a is a [min A, max A]-bow tie, for each a 2 ind(A); B is a [0, 0]-bow tie, for each B 2 CN. Thus, the traces in Q can be represented in polynomial space because only the middle section and at most |CN| steps at both ends need to be stored. Since the traces are extended with every rule application, the procedure terminates after polynomially many steps; Theorem 11 follows.

7 Discussion and Future Work We summarize the fragments of TEL, their relationships and the obtained complexity results in the following diagram:

undecidable TEL

non-elem TEL3

undecidable ultim. period. TEL

inﬂ PTIME DL-acyclic TEL

temp. acyclic TEL

l-rig in EXPTIME

loc PSPACE acyclic TEL

in PTIME EL PTIME acyclic EL

in PTIME where the solid lines are inclusions between DLs, the dashed line is a reduction that preserves answers to all queries (model conservative extension). The data complexity is indicated by shading and the combined complexity is below the language.

Our data-tractability results show theoretical adequacy of the identiﬁed fragments of TEL for data-intensive applications. Our two novel forms of acyclicity, DLand temporal, are somewhat close in spirit to multi-separability [Chomicki, 1990]: the latter, however, puts a weaker restriction on recursion but a stricter one on the interaction between the temporal and data component. DL-acyclic TEL is the ﬁrst (to the best of our knowledge) DL shown to have NC1-complete query answering (the large gap between data and combined complexity is also remarkable). On the practical side, there is evidence that such data-tractable fragments should be sufﬁcient for many biomedical applications. Following the principles of OBDA, our framework provides a means of deﬁning temporal concepts in the ontology for these applications: temporal concepts capture both (restricted) tree-shaped temporal conjunctive queries (CQs) and recurring temporal patterns.

As our immediate future work, we will address decidability of (full) TEL and then consider CQs with the + operation on temporal terms. We expect that our positive results can be lifted to CQs using the combined approach [Lutz, Toman, and Wolter, 2009], which utilizes a structure similar to our canonical quasimodel. We will also study succinct and expressive representations of temporal data. For example, the only known algorithm for DATALOG1S with binary encoding of timestamps in the data runs in EXPTIME in the size of the data [Chomicki and Imielinski, 1988]. We, however, presume that careful materialization should be sufﬁcient to deal with the issue. We will also consider interval encoding of temporal ABoxes, e.g., A(a, [n1, n2]), and settings capturing inﬁnite temporal periodic data as introduced by Kabanza, St evenne, and Wolper [1990] and Chomicki and Imielinski [1993].

[Artale et al., 2007] Artale, A.; Kontchakov, R.; Lutz, C.;

Wolter, F.; and Zakharyaschev, M. 2007. Temporalising tractable description logics. In Proc. TIME-07, 11 22. [Artale et al., 2013a] Artale, A.; Kontchakov, R.; Ryzhikov,

V.; and Zakharyaschev, M. 2013a. The complexity of clausal fragments of LTL. In Proc. LPAR-19, 2013, 35 52. [Artale et al., 2013b] Artale, A.; Kontchakov, R.; Wolter, F.;

and Zakharyaschev, M. 2013b. Temporal description logic for ontology-based data access. In Proc. IJCAI-13. [Artale et al., 2015] Artale, A.; Kontchakov, R.; Kovtunova,

A.; Ryzhikov, V.; Wolter, F.; and Zakharyaschev, M. 2015. First-order rewritability of temporal ontology-mediated queries. In Proc. IJCAI-15, 2706 2712. [Baader, Borgwardt, and Lippmann, 2015] Baader, F.; Borg-

wardt, S.; and Lippmann, M. 2015. Temporal query entailment in the description logic SHQ. J. Web Sem. 33:71 93. [Baader, Brandt, and Lutz, 2005] Baader, F.; Brandt, S.; and

Lutz, C. 2005. Pushing the EL envelope. In Proc. IJCAI-05. [Baudinet, Chomicki, and Wolper, 1993] Baudinet, M.; Chomicki, J.; and Wolper, P. 1993. Temporal deductive databases. In Temporal Databases. Benjamin/Cummings. [Bienvenu, Lutz, and Wolter, 2012] Bienvenu, M.; Lutz, C.;

and Wolter, F. 2012. Deciding FO-rewritability in EL. In Proc. DL-12. [Borgwardt and Thost, 2015] Borgwardt, S., and Thost, V.

2015. Temporal query answering in the description logic EL. In Proc. IJCAI-15, 2819 2825. [Borgwardt, Lippmann, and Thost, 2015] Borgwardt, S.; Lippmann, M.; and Thost, V. 2015. Temporalizing rewritable query languages over knowledge bases. J. Web Sem. 33:50 70. [Chen and Lin, 1993] Chen, C.-C., and Lin, I.-P. 1993. The

computational complexity of satisﬁability of temporal Horn formulas in propositional linear-time temporal logic. Inf. Process. Lett. 45(3):131 136. [Chomicki and Imielinski, 1988] Chomicki, J., and Imielin-

ski, T. 1988. Temporal deductive databases and inﬁnite objects. In Proc. PODS-88, 61 73. [Chomicki and Imielinski, 1993] Chomicki, J., and Imielin-

ski, T. 1993. Finite representation of inﬁnite query answers. ACM Trans. Database Syst. 18(2):181 223. [Chomicki and Toman, 2005] Chomicki, J., and Toman, D.

2005. Temporal Databases. In Handbook of Temporal Reasoning in Artiﬁcial Intelligence, 429 468. Elsevier. [Chomicki, 1990] Chomicki, J. 1990. Polynomial time query

processing in temporal deductive databases. In Proc. PODS. [Dong and Tan, 2015] Dong, X. L., and Tan, W. 2015. A time

machine for information: Looking back to look forward. PVLDB 8(12):2044 2055. [Gabbay et al., 2003] Gabbay, D.; Kurucz, A.; Wolter, F.; and

Zakharyaschev, M. 2003. Many-dimensional modal logics: theory and applications. Elsevier.

[Gene Ontology Cons., 2000] Gene Ontology Cons. 2000.

Gene ontology: Tool for the uniﬁcation of biology. Nature Genetics 25:25 29. [Guti errez-Basulto and Klarman, 2012] Guti errez-Basulto,

V., and Klarman, S. 2012. Towards a unifying approach to representing and querying temporal data in description logics. In Proc. RR-12, 90 105. [Guti errez-Basulto, Jung, and Lutz, 2012] Guti errez-

Basulto, V.; Jung, J. C.; and Lutz, C. 2012. Complexity of branching temporal description logics. In Proc. ECAI-12. [Guti errez-Basulto, Jung, and Schneider, 2014] Guti errez-

Basulto, V.; Jung, J.; and Schneider, T. 2014. Lightweight description logics and branching time: a troublesome marriage. In Proc. KR-14. [Guti errez-Basulto, Jung, and Schneider, 2015] Guti errez-

Basulto, V.; Jung, J.; and Schneider, T. 2015. Lightweight temporal description logics with rigid roles and restricted TBoxes. In Proc. IJCAI-15, 3015 3021. [Haase and Lutz, 2008] Haase, C., and Lutz, C. 2008. Com-

plexity of subsumption in the EL family of description logics: Acyclic and cyclic TBoxes. In Proc. of ECAI-08. [Kabanza, St evenne, and Wolper, 1990] Kabanza, F.; St evenne, J.; and Wolper, P. 1990. Handling inﬁnite temporal data. In Proc. PODS-90, 392 403. [Klarman and Meyer, 2014] Klarman, S., and Meyer, T. 2014.

Querying temporal databases via OWL 2 QL. In Proc. RR14, 92 107. [Krisnadhi and Lutz, 2007] Krisnadhi, A., and Lutz, C. 2007.

Data complexity in the EL family of description logics. In Proc. LPAR-07, 333 347. [Lutz, Toman, and Wolter, 2009] Lutz, C.; Toman, D.; and

Wolter, F. 2009. Conjunctive query answering in the description logic EL using a relational database system. In Proc. IJCAI-09. [Manna and Wolper, 1984] Manna, Z., and Wolper, P. 1984.

Synthesis of communicating processes from temporal logic speciﬁcations. ACM Trans.Program.Lang.Syst. 6(1):68 93. [O Connor et al., 2009] O Connor, M. J.; Shankar, R. D.; Par-

rish, D. B.; and Das, A. K. 2009. Knowledge-data integration for temporal reasoning in a clinical trial system. Int. J. Med. Inform. 78:S77 S85. [Roth and Tan, 2013] Roth, M., and Tan, W. 2013. Data

integration and data exchange: It s really about time. In Proc. CIDR-13. [Shankar et al., 2008] Shankar, R. D.; Martins, S. B.; O Connor, M. J.; Parrish, D. B.; and Das, A. K. 2008. Representing and reasoning with temporal constraints in clinical trials using semantic technologies. In BIOSTEC-08, Revised Selected Papers, 520 530. [Sistla and Clarke, 1985] Sistla, A. P., and Clarke, E. M.

1985. The complexity of propositional linear temporal

logics. J. ACM 32(3):733 749. [Stockmeyer, 1974] Stockmeyer, L. J. 1974. The Complexity

of Decision Problems in Automata Theory and Logic. Ph.D. Dissertation, MIT, Cambridge, Massachusetts, USA.