# incremental_event_calculus_for_runtime_reasoning__7cea3a21.pdf

Journal of Artiﬁcial Intelligence Research 73 (2022) 967-1023 Submitted 01/2021; published 03/2022

Incremental Event Calculus for Run-Time Reasoning

Efthimis Tsilionis eftsilio@di.uoa.gr, Department of Informatics & Telecommunications eftsilio@iit.demokritos.gr National and Kapodistrian University of Athens, Greece Institute of Informatics & Telecommunications NCSR Demokritos , Greece

Alexander Artikis a.artikis@unipi.gr Department of Maritime Studies University of Piraeus, Greece Institute of Informatics & Telecommunications NCSR Demokritos , Greece

Georgios Paliouras paliourg@iit.demokritos.gr Institute of Informatics & Telecommunications NCSR Demokritos , Greece

We present a system for online, incremental composite event recognition. In streaming environments, the usual case is for data to arrive with a (variable) delay from, and to be revised by, the underlying sources. We propose RTECinc, an incremental version of RTEC, a composite event recognition engine with formal, declarative semantics, that has been shown to scale to several real-world data streams. RTEC deals with delayed arrival and revision of events by computing all queries from scratch. This is often ineﬃcient since it results in redundant computations. Instead, RTECinc deals with delays and revisions in a more eﬃcient way, by updating only the aﬀected queries. We examine RTECinc theoretically, presenting a complexity analysis, and show the conditions in which it outperforms RTEC. Moreover, we compare RTECinc and RTEC experimentally using real-world and synthetic datasets. The results are compatible with our theoretical analysis and show that RTECinc outperforms RTEC in many practical cases.

1. Introduction

Streaming environments combine simple, derived events (SDEs) in order to recognize in realtime composite events (CEs) that satisfy a given pattern. These patterns are collections of simpler events, which are subject to temporal and atemporal constraints and may be combined with static background knowledge (Cugola & Margara, 2012; Alevizos et al., 2017; Giatrakos et al., 2020).

The Event Calculus for Run-Time Reasoning (RTEC) (Artikis et al., 2015) is a composite event recognition (CER) system that has been tested and proven eﬃcient in numerous applications, such as urban traﬃc management, public space surveillance and maritime situational awareness (Patroumpas et al., 2017). RTEC is a dialect of the Event Calculus, which is a logic programming formalism for representing and reasoning about events and their eﬀects (Kowalski & Sergot, 1986). Moreover, RTEC includes various optimisation techniques for computing the intervals of CEs in a data stream.

c 2022 AI Access Foundation. All rights reserved.

Tsilionis, Artikis, & Paliouras

In real-life streaming tasks, the usual case is for the input events to arrive with variable delays to the CER system. In the maritime domain, for example, delays occur in the input events when the stations (terrestrial and satellite) that collect the position signals of vessels have to deal with a great amount of messages. Furthermore, revisions or retractions may be applied to input events; e.g. the start or end time of an event may be wrong and subsequently corrected by the event source. Delays and retractions in the input stream are being handled by RTEC by means of windowing. RTEC employs overlapping windows in order to wait for delayed events and retractions. A drawback is that CE intervals within a window are computed from scratch, without considering the CE intervals of the previous overlapping windows. This way the intervals of a CE will be re-calculated even if delays and retractions do not have an eﬀect on them. In previous work (Tsilionis et al., 2019a), we presented an incremental version of RTEC, i.e. RTECinc, that overcomes this ineﬃciency for a type of CEs. In the present paper, we extend the incremental algorithm in order to handle all types of CEs supported by RTEC. In particular, the contributions of this paper are the following:

We extend RTECinc by supporting all types of CEs of the language of RTEC.

We provide a detailed theoretical evaluation of RTECinc and show the conditions in which it achieves lower computational complexity compared to RTEC.

We compare RTECinc and RTEC experimentally using real-world and synthetic datasets from diﬀerent application domains. The results are compatible with our complexity analysis and show that RTECinc is preferable over RTEC in many practical cases. The code of RTECinc is publicly available1. Moreover, some of the employed datasets are available2, allowing the reproducibility of our results.

The structure of the paper is as follows: Section 2 summarises the functionality of RTEC. Sections 3 and 4 elaborate on the algorithmic details of the incremental procedure and the complexity analysis. Section 5 presents our empirical analysis, while Section 6 discusses related work. Finally, in Section 7 we summarise the work and outline future work directions.

2. Background: Run-Time Event Calculus

2.1 Language

The time model used by RTEC is linear and includes integer time-points (Artikis et al., 2015). If F is a ﬂuent a property that is allowed to have diﬀerent values at diﬀerent points in time the term F=V denotes that ﬂuent F has value V . Boolean ﬂuents are a special case in which the possible values are true and false. holds At(F=V, T) is a predicate representing that ﬂuent F has value V at time-point T. holds For(F=V, I) represents that I is the list of maximal intervals for which F=V holds continuously. holds At and holds For are deﬁned in such a way that, for any ﬂuent F, holds At(F=V, T) if and only if T belongs to one of the maximal intervals of I for which holds For(F=V, I).

1. https://github.com/eftsilio/Incremental_RTEC 2. The datasets concerning maritime activity in the area of Brest, France, are public and can be downloaded through the link provided in the github repository of RTECinc.

Incremental Event Calculus for Run-Time Reasoning

Table 1: Main predicates of RTEC.

Predicate Meaning

happens At(E, T) Event E occurs at time T

holds At(F=V, T) The value of ﬂuent F is V at time T

holds For(F=V, I) I is the list of maximal intervals for which F=V holds continuously

initiated At(F=V, T) At time T the simple ﬂuent F=V is initiated

terminated At(F=V, T) At time T the simple ﬂuent F=V is terminated

relative complement all(I , L, I) I is the list of maximal intervals produced by the relative complement of the list of maximal intervals I

with respect to every list of maximal intervals of list L

union all(L, I) I is the list of maximal intervals produced by the union of the lists of maximal intervals of list L

intersect all(L, I) I is the list of maximal intervals produced by the intersection of the lists of maximal intervals of list L

An event description in RTEC comprises rules that express: (a) event occurrences using the happens At predicate, (b) the eﬀects of events using the initiated At and terminated At predicates, (c) the values of ﬂuents, with the use of the holds At and holds For predicates, as well as other, possibly atemporal, parameters. Table 1 summarises the RTEC predicates available to the event description developer. Variables start with an upper-case letter, while predicates and constants start with a lower-case letter. Event Calculus events express instantaneous SDEs and CEs, while ﬂuent-value pairs express durative SDEs and CEs. In CER, the majority of CEs are durative and, therefore, the task is to compute the maximal intervals for which a ﬂuent-value pair F=V representing a CE has a particular value continuously. A ﬂuent F may have at most one value at each time-point. This constraint is satisﬁed by built-in axioms of RTEC (presented in the Appendix). Fluents in RTEC are of two kinds: simple and statically determined. The incremental computation of the maximal intervals of simple and statically determined ﬂuent-value pairs is the subject of this paper.

2.1.1 Simple Fluents

Simple ﬂuents are deﬁned by means of initiated At and terminated At rules. Below we present an abstract initiated At rule. terminated At rules have similar form.

initiated At(F=V, T)

happens At(A, T),

holds At(B=VB, T),

not happens At(C, T),

not holds At(D=VD, T).

Tsilionis, Artikis, & Paliouras

Rule (1) is a rule of conjunctions, meaning that all body literals should be satisﬁed in order for the rule to ﬁre. not denotes negation by failure (Clark, 1977). The variable T, present at the head and all body literals, expresses that all literals are evaluated at the same timepoint. Rule (1) is satisﬁed at time-point T if event A has occurred at T, there exists an interval of ﬂuent B that includes T, there is no occurrence of event C at T and there is no interval of ﬂuent D that includes T. We use the term positive to refer to events and ﬂuents that must occur at or include T, e.g. A and B, and the term negative to refer to events and ﬂuents that should not occur at or include T (symbol not), e.g. C and D. initiated At and terminated At rules of type (1) are not restricted in the number of body literals. The only requirement is the ﬁrst body literal to be a positive happens At predicate, which can then be followed by a possibly empty set of positive/negative happens At and holds At predicates. An example simple ﬂuent deﬁnition, from the maritime domain (Pitsikalis et al., 2019), is presented below:

initiated At(gap(Vessel)=near Ports, T)

happens At(gap start(Vessel), T),

holds At(within Area(Vessel, near Ports)=true, T).

initiated At(gap(Vessel)=far From Ports, T)

happens At(gap start(Vessel), T),

not holds At(within Area(Vessel, near Ports)=true, T).

terminated At(gap(Vessel)=near Ports, T)

happens At(gap end(Vessel), T).

terminated At(gap(Vessel)=far From Ports, T)

happens At(gap end(Vessel), T).

The above set of rules formalises the notion of a communication gap . Communication gaps occur when a vessel is not emitting its position, either due to the absence of a nearby receiving station or on purpose. In maritime situational awareness, communication gaps are important since the vessel s behavior is unknown and may indicate an intention of hiding (e.g. in cases of illegal ﬁshing in a protected area). Notice that there is a distinction between gaps occurring near ports and those occurring in the open sea. The former are usually not signiﬁcant in maritime monitoring. For a succinct formulation, we represent a communication gap as a multi-valued ﬂuent, as opposed to Boolean ﬂuents. gap start and gap end are input events produced by a module annotating vessel position streams in real-time, indicating that a vessel stopped and re-started transmitting its position, respectively (Patroumpas et al., 2017). within Area(Vessel, near Ports) is a durative input event, denoting the periods of time a vessel is near a port. It is produced by online spatial processing of the position signals of the vessels (Santipantakis et al., 2018). According to rule-set (2), a communication gap is initiated for a Vessel if a gap start has occurred near or far from ports, and terminated when a gap end event is detected. RTEC utilises the time-points produced by initiated At and terminated At rules to construct the maximal intervals during which a simple ﬂuent has a particular value continuously. Therefore, to compute the intervals I for which F=V , i.e. holds For(F=V, I), we ﬁnd all time-points Ts at which F=V is initiated by using initiated At rules, and then, for each Ts,

Incremental Event Calculus for Run-Time Reasoning

Table 2: Examples of the three interval manipulation constructs of RTEC.

Interval manipulation IA IB IF construct

union all([IA, IB], IF ) [(5, 20), (26, 30)] [(28, 35)] [(5, 20), (26, 35)]

intersect all([IA, IB], IF ) [(26, 31)] [(21, 26), (30, 40)] [(30, 31)]

relative complement all(IA, [IB], IF ) [(5, 20), (26, 30)] [(1, 4), (18, 22)] [(5, 18), (26, 30)]

we compute the ﬁrst time-point Tf after Ts at which F=V is terminated, by evaluating

terminated At rules. This is an implementation of the law of inertia.

2.1.2 Statically Determined Fluents

In addition to simple ﬂuents, an event description may include domain-speciﬁc holds For rules that deﬁne the values of a ﬂuent F in terms of the intervals of other ﬂuents. These ﬂuents are called statically determined and the rules that deﬁne them make use of interval manipulation constructs see the last three items of Table 1. Consider the following abstract rule:

holds For(F=V, IF )

holds For(A=VA, IA),

holds For(B=VB, IB),

interval manipulation([IA, IB], IF ).

The evaluation of rule (3) will produce the maximal intervals IF of F=V , by combining the maximal intervals IA and IB of A=VA and B=VB according to a given interval manipulation operation. The three interval manipulation constructs of RTEC are union all, intersect all and

relative complement all. Table 2 displays examples of these constructs. A term of the form (Ts, Te) represents the closed-open interval [Ts, Te). IF in union all([IA, IB], IF ) is a list of maximal intervals that includes each time-point that is part of the lists of maximal intervals IA or IB. IF in intersect all([IA, IB], IF ) is a list of maximal intervals that includes each timepoint that is part of the lists IA and IB. Finally, IF in relative complement all(IA, [IB], IF ) is a list of maximal intervals including each time-point of IA that is not part of IB.

Consider in rule (4) an example deﬁnition of a statically determined ﬂuent from the maritime domain. According to rule (4) two vessels are assumed to perform a rendez Vous if they are close to each other and are stopped in the open sea or moving with low speed. Maritime analysts are interested in these cases as they may be indicators of illegal ship-toship transfers. The proximity ﬂuent is a durative input SDE denoting the periods of time in which two vessels are close to each other, and is produced by online spatial processing. low Speed and stopped are simple ﬂuents denoting when a vessel is moving with a low speed or stopped, respectively.

Tsilionis, Artikis, & Paliouras

holds For(rendez Vous(Vessel1,Vessel2)=true, I)

holds For(proximity(Vessel1,Vessel2)=true, Ip),

holds For(low Speed(Vessel1)=true, Il1),

holds For(low Speed(Vessel2)=true, Il2),

holds For(stopped(Vessel1)=far From Ports, Is1),

holds For(stopped(Vessel2)=far From Ports, Is2),

union all([Il1, Is1], I1),

union all([Il2, Is2], I2),

intersect all([I1, I2, Ip], I).

The interval manipulation constructs of RTEC support the following type of speciﬁcation: for all time-points T, F=V holds at T if and only if a Boolean combination of ﬂuent-value pairs holds at T. For many ﬂuents, this is a much more concise (and cheaper) speciﬁcation than the traditional style of Event Calculus representation, i.e. identifying all conditions in which the ﬂuent is initiated and terminated, so that maximal intervals may then be computed using the domain-independent holds For.

2.2 Semantics and Operation

The language of RTEC supports non-monotonic reasoning and favours Clark completion since it assumes a closed-world description, meaning that the information provided for predicates is suﬃcient and necessary. Rules in RTEC are safe , i.e. every variable that appears in the head of the rule or in any negative literal in the body also appears in at least one positive literal in the body. CE deﬁnitions are (locally) stratiﬁed logic programs (Przymusinski, 1987). Stratiﬁcation allows the mapping of all ﬂuent-value pairs F=V and all events to the non-negative integers. At level 0 we ﬁnd the events and statically determined ﬂuents that do not depend on any other events or ﬂuents, and serve as input to the CER system (these are the explicit facts of our knowledge base). Simple ﬂuents are output entities and thus, belong to higher strata. Events and ﬂuents of level n are deﬁned in terms of at least one event or ﬂuent-value of level n 1 and a possibly empty set of events and ﬂuent-values from levels lower than n 1. In other words, we restrict our attention to hierarchical deﬁnitions; in Figure 1 we present an example CE deﬁnition hierarchy from the maritime domain. The restriction to hierarchical deﬁnitions allows RTEC to perform recognition in a bottom-up manner, according to which the ﬂuents and events at the bottom of the hierarchy are processed ﬁrst and their intervals are cached. Subsequently, RTEC processes the events and ﬂuents of the next level and caches their intervals, and reaches level-by-level the top of the hierarchy. This way, when evaluating the rules of level n, the intervals of the ﬂuents and events of the body literals are simply fetched from the cache, thus avoiding unnecessary re-computations. A CER system consumes a stream of SDEs which may not necessarily be temporally ordered, meaning that there may exist a (variable) delay between the time each SDE occurs and the time of arrival at the CER system. Furthermore, there may be revisions and retractions of SDEs. Consider for example the case where the parameters of an SDE were

Incremental Event Calculus for Run-Time Reasoning

within Area

trawl Speed

before Aground

travel Speed

speed Less Than Min

speed Greater Than Max

rendez Vous

unusual Speed

Figure 1: Hierarchy graph of maritime CE deﬁnitions (Pitsikalis et al., 2019). The nodes of the graph express ﬂuents while an arrow from node X to node Y expresses that ﬂuent X is a body literal in the deﬁnition of ﬂuent Y.

originally computed erroneously and subsequently revised, or the case where an SDE was reported by mistake, and the mistake was realized later (Anicic et al., 2012). In RTEC, the CER process involves the computation of the maximal intervals of ﬂuentvalue pairs expressing CEs. This process takes place at speciﬁed query times q1, q2, . . . . The recognition at each qi is performed over the SDEs that fall within a speciﬁed interval, the working memory or window ω. All SDEs outside the window are discarded and not considered during recognition. This means that at each qi the CER depends only on the SDEs that took place in the interval (qi ω, qi]. The size of ω as well as the temporal distance between two consecutive query times the step (qi qi 1) are user-speciﬁed. In order to deal with delays or retractions of SDEs, the user must set ω to be longer than the step, i.e. qi ω < qi 1 < qi. When ω is longer than the step, it is possible that an SDE occurs in the interval (qi ω, qi 1] but arrives at RTEC in (qi 1, qi]; its eﬀects are taken into account at query time qi. Similarly, the eﬀects of SDE retraction will be taken into consideration for SDEs that took place in (qi ω, qi 1] and were revised in (qi 1, qi]. However, information may still be lost. Any SDEs arriving or retracted between qi 1 and qi are discarded at qi if they took place before or at qi ω. At each query time qi, RTEC computes from scratch and stores the intervals of ﬂuentvalue pairs expressing CEs. Figure 2 illustrates the process of computing the initiation points of the ﬂuent-value pair F=V at two consecutive query times with the use of rule (1) (see page 969). In this example, the window ω is longer than the step; this way, we may consider delayed arrivals or retraction of events, as well as ﬂuent intervals calculated or removed at qi and falling in (qi ω, qi 1]. At the upper part of Figure 2 the upward arrows represent the initiation points calculated at the previous query time qi 1 with the use of rule (1). These points are the result of occurrences of event A and intervals of ﬂuent B that include the occurrences of A. Additionally, event C did not occur at these time-points and the intervals of ﬂuent D did not include these time-points.

Tsilionis, Artikis, & Paliouras

At the bottom part of Figure 2, we present the initiation points produced by evaluating rule (1) at qi. Notice the presence of delayed arrivals for events A and C (green dots), of a new interval computed for ﬂuent D (green line), of a retraction of event C (red dot), and of an interval reduction for ﬂuent B. The delayed arrival of event A along with the retraction of event C lead to a new initiation point. Two initiation points that were present at qi 1 are no longer present at qi. The ﬁrst, due to the new interval of ﬂuent D or the diminished interval of ﬂuent B (both have the same eﬀect), and the second due to a delayed arrival of event C. However, three out of the four initiation points calculated at qi are identical among the two query times (see the vertical dashed lines). These initiation points are not aﬀected by delays and retractions, but are re-computed. This is unnecessary and indicates the redundant computations performed by RTEC. Note that the same has to be done for termination points. In large event descriptions, expressing big CE hierarchies (recall e.g. Figure 1), this process can be very expensive. The example displayed in Figure 2 concerns simple ﬂuents,

happens At(A, T), holds At(B=VB, T), not happens At(C, T), not holds At(D=VD, T).

happens At(A, T), holds At(B=VB, T), not happens At(C, T), not holds At(D=VD, T).

Figure 2: Example of initiation point computation. The upper part of the ﬁgure shows the initiation points of ﬂuent-value pair F=V , as deﬁned by rule (1), calculated at the previous query time qi 1, while the bottom part shows the initiation points calculated at qi. Dots represent event occurrences, unlabeled horizontal lines represent ﬂuent intervals and arrows facing upwards represent initiation points. Green dots express event instances that arrived at RTEC at qi. Green lines express ﬂuent intervals that were calculated at qi. Red dots (resp. lines) express event instances (ﬂuent intervals) that were retracted at qi. On the left of each part of the ﬁgure, we display the conditions of rule (1), indicating the type of event/ﬂuent expressed by the corresponding row of dots/lines. For example, the ﬁrst row of dots expresses occurrences of event A, the second row of lines expresses intervals of ﬂuent B, and so on. Vertical dashed lines indicate the initiation points that are common between the two query times.

Incremental Event Calculus for Run-Time Reasoning

Table 3: Notation of incremental simple ﬂuent computation.

Notation Meaning

se Qi The set of starting/initiation and ending/termination points calculated at qi and falling in (qi ω, qi 1]

IQi The set of event occurrences and ﬂuent intervals available at qi and overlapping (qi ω, qi 1]

I+ The set of event occurrences and ﬂuent intervals inserted at qi and overlapping (qi ω, qi 1]

I The set of event occurrences and ﬂuent intervals retracted at qi and overlapping (qi ω, qi 1]

but RTEC suﬀers from similar ineﬃciencies in the case of statically determined ﬂuents. In the following sections we will present methods for the incremental evaluation of the maximal intervals of both types of ﬂuents, addressing these ineﬃciencies.

3. Incremental Evaluation of Simple Fluents

We present RTECinc, an extension of RTEC that includes a process for the incremental computation of the intervals of simple ﬂuents. In what follows, we assume that windows are overlapping, i.e. that qi ω < qi 1 < qi. Moreover, incremental computation concerns the overlap between consecutive windows, i.e. (qi ω, qi 1]. Reasoning in (qi 1, qi] may be performed using RTEC. Recall from Section 2.1.1 that the intervals of a simple ﬂuent are produced on the basis of its initiation and termination points. Notice also that some of the initiation and termination points might not contribute to the intervals of the ﬂuent at qi but might contribute to future intervals. To make this more clear, assume that at the previous query time, qi 1, an ending point, Tf, was calculated, but the absence of a starting point, Ts, prevented the construction of the interval (Ts, Tf]. Furthermore, consider that at qi the delayed arrival of an event gives rise to the starting point Ts < Tf and as result to the interval (Ts, Tf]. Thus, we must store all the initiation and termination points that fall inside the overlap of consecutive windows, i.e. (qi ω, qi 1]. Table 3 summarizes the notation used throughout this section. Algorithm 1 shows the pseudo-code of recognise Simple Fluent, the procedure for incrementally computing and storing the intervals of simple ﬂuents in RTECinc. This procedure comprises several steps, some of which are in common with RTEC and are shown in bold. First, RTECinc retrieves from the computer memory the initiation and termination points se Qi 1 and the maximal intervals IQi 1 of F=V computed at the previous query time qi 1 (see line 1). The second step (line 2 in Algorithm 1) deletes initiation and termination points, calculated at qi 1 and no longer holding at qi. Next, the addition phase (line 3 in Algorithm 1) calculates new initiation and termination points, i.e. points that were not present at qi 1. The details of the deletion and addition steps are presented in the following sub-sections. The time-points that survived the deletion phase and the new ones produced in the addition phase constitute the initiation/termination points of F=V at qi, i.e. se Qi. The next step

Tsilionis, Artikis, & Paliouras

Algorithm 1 recognise Simple Fluent

1: retrieve(F=V, se Qi 1, IQi 1)

2: delete SEpoints(F=V, se Qi 1)

3: compute SEpoints(F=V, se Qi 1, se Qi)

4: make Intervals(se Qi, IQi)

5: symmetric Diﬀerence(IQi, IQi 1, I+, I )

6: assert(F=V, se Qi, IQi, I+, I )

concerns the construction of the intervals of F=V by means of the initiation and termination points. This process is done as in RTEC and leads to the intervals of F=V at qi, IQi (line 4 in Algorithm 1). At the end of this process, RTECinc computes the intervals that were added (I+), and retracted (I ) for F=V at qi. This is necessary because F=V itself may participate, positively or negatively, in the body of initiated At and terminated At rules of a ﬂuent of a higher stratum. For example, consider again the hierarchy displayed in Figure 1 and notice that the intervals of the ﬂuent rendez Vous depend on the intervals of the ﬂuents stopped and low Speed. In order to calculate incrementally the intervals of rendez Vous, we need to know the intervals that were added (I+) and retracted (I ) for each one of the two constituent ﬂuents. This is achieved by calculating the symmetric diﬀerence between the intervals of F=V calculated at qi 1, IQi 1, and the intervals calculated at qi, IQi (line 5 in Algorithm 1). Finally, the computed list of intervals IQi, along with the initiation/termination points calculated at qi, se Qi, and the intervals in I+ and I , are stored (line 6 in Algorithm 1), replacing the ones computed at qi 1. In the following subsections, we delve into the details of the deletion and the addition phases, since these constitute the main diﬀerences between RTECinc and RTEC, with respect to simple ﬂuent recognition.

3.1 Deletion Phase

In the deletion phase, RTECinc examines which of the initiation and termination points falling in the overlapping part of two consecutive windows, (qi ω, qi 1], still hold at qi. An initiation/termination point may no longer hold for several reasons. In the case of negative

happens At literals (see e.g. literal C in rule (1) on page 969), the event may have occurred in the interval (qi ω, qi 1] but arrived in (qi 1, qi]. If the time occurrence of this delayed event coincides with an initiation/termination point calculated at qi 1, the latter must be retracted. In the case of negative ﬂuents (see ﬂuent D in rule (1)), an interval, not present at qi 1, computed at qi and falling in (qi ω, qi 1] may include an initiation/termination point. Again, the initiation/termination point should be removed. In the case of positive events (see event A in rule (1)), the time of occurrence of the event may have been reported by mistake at qi 1, and by qi the mistake may have been realized and the speciﬁc event occurrence retracted. If the time of this event occurrence coincides with the time of an initiation/termination point, then this point should be deleted. Finally, in the case of positive ﬂuents (see ﬂuent B in rule (1)), an interval, calculated at qi 1 and falling in (qi ω, qi 1], that included an initiation/termination point may be deleted or

Incremental Event Calculus for Run-Time Reasoning

reduced. As a consequence, the interval may no longer include the initiation/termination point and once again this point should be removed. To determine which of the initiation/termination points of F=V survive, we use a transformation of the rules expressing CE deﬁnitions. Rule (5) below e.g. is a transformation of rule (1): h initiated At(F=V, T) ise Qi 1 h happens At(A, T) i I h holds At(B=VB, T) i I h happens At(C, T) i I+ h holds At(D=VD, T) i I+ .

Rule (5) is a delta rule determining if an initiation point must be deleted; termination points are handled in a similar manner. As opposed to rule (1), this is a rule of disjunctions. Notice that the negative predicates of rule (1) occur positively in the body of rule (5). The superscripts in the head and the body literals indicate the set in which the time argument T is evaluated on. See Table 3 for the deﬁnitions of these sets. We evaluate rule (5) for each initiation point of se Qi 1, i.e. for each initiation point calculated at qi 1 that falls in (qi ω, qi 1]. Since rule (5) consists of disjunctions, it suﬃces for the time argument of only one of the body literals to match the value of T in the head in order for the rule to be satisﬁed. Rule (5) ﬁres, stating that an initiation point T in se Qi 1 should be deleted, if and only if one of the following conditions is satisﬁed:

(a) T coincides with the time of a retracted occurrence of event A (set I ).

(b) T belongs to a retracted interval of B=VB (set I ).

(c) T matches the timestamp of a delayed arrival of event C (set I+).

(d) T belongs to a new interval of D=VD calculated at qi (set I+), due to a delayed event arrival or retraction.

When all points in se Qi 1 have been examined the deletion phase terminates. Figure 3 illustrates the deletion phase with the use of an example. The evaluation of rule (1) at qi 1 results in the initiation points shown in the upper part of Figure 3. The occurrences of event A belonging to intervals of B=VB, not coinciding with occurrences of event C and not included in the intervals of D=VD, leads to the calculation of these points at qi 1. In the bottom part of Figure 3, rule (5) is used to examine which of these points should be removed. The vertical dashed lines indicate the points not surviving the deletion process. The deletion of each of these points is due to one of the four conditions outlined above. For example, the ﬁrst initiation point is removed because the corresponding occurrence of A was retracted, while the last one was removed due to an interval of D=VD, calculated at qi, that includes this initiation point.

Tsilionis, Artikis, & Paliouras

happens At(A, T), holds At(B=VB, T), not happens At(C, T), not holds At(D=VD, T).

[happens At(A, T)] I v

[holds At(B=VB, T)] I

[happens At(C, T)] I

v [holds At(D=VD, T)] I

Figure 3: Illustration of the deletion phase. The upper part of the ﬁgure shows the initiation points calculated at qi 1 using rule (1). The bottom part displays the deletion process at qi using the delta rule (5). The non-overlapping part of the two query times, (qi 1, qi] is greyed out since the deletion phase concerns only the overlapping part, i.e. (qi ω, qi 1]. Dots represent event occurrences, unlabeled horizontal lines represent ﬂuent intervals and arrows facing upwards represent initiation points. The black color signiﬁes event occurrences and ﬂuent intervals present both at qi 1 and qi, the green color signiﬁes delayed arrival of events and ﬂuent interval computation at qi, while the red color signiﬁes event occurrences and ﬂuent intervals retracted at qi. On the left of each part of the ﬁgure, we display the rule conditions indicating the type of event/ﬂuent expressed by the corresponding row of dots/lines. Enlarged dots and lines denote participation in the deletion process. The vertical dashed lines indicate the initiation points that will be removed.

3.2 Addition Phase

Once the deletion phase has completed, the addition phase commences. The addition phase consists of the calculation of new initiation and termination points, i.e. points that were not present at qi 1. The new time-points may belong to the overlapping part of the two consecutive windows, (qi ω, qi 1], or to the non-overlapping part, (qi 1, qi]. We focus on the overlapping part, since it diﬀerentiates the method of computation of RTECinc from that of RTEC. Computation in the non-overlapping part, (qi 1, qi], is identical in the two algorithms, since initiation/termination points belonging to this part are not aﬀected by delays or retractions.

Consider rule (1) again and assume that at qi 1 the rule did not ﬁre, but all body predicates except the ﬁrst one were satisﬁed at time-point T. At qi a delayed arrival of event A at time-point T will activate the rule. Similarly, deletions of event occurrences or ﬂuent intervals may lead to the satisfaction of a rule. For example, if rule (1) did not ﬁre at

Incremental Event Calculus for Run-Time Reasoning

initiated At(F=V, T) h happens At(A, T) i I+ , h holds At(B=VB, T) i IQi ,

not h happens At(C, T) i IQi ,

not h holds At(D=VD, T) i IQi .

initiated At(F=V, T) h happens At(A, T) i IQi\I+ , h holds At(B=VB, T) i I+ ,

not h happens At(C, T) i IQi ,

not h holds At(D=VD, T) i IQi .

initiated At(F=V, T) h happens At(C, T) i I , h happens At(A, T) i IQi\I+ , h holds At(B=VB, T) i IQi\I+ ,

not h holds At(D=VD, T) i IQi .

initiated At(F=V, T) h happens At(A, T) i IQi\I+ , h holds At(D=VD, T) i I , h holds At(B=VB, T) i IQi\I+ ,

not h happens At(C, T) i IQi I .

qi 1 due to the fact that event C occurred at T, but at qi the speciﬁc occurrence of event C was retracted, the rule would ﬁre. To calculate the new initiation points, we use the delta rules presented in (6) (termination points are handled similarly). The rules in (6) are examined in the given order. The superscripts of these rules correspond to the set in which the time argument T is evaluated and are presented in Table 3. In rule (6)(a), event A is evaluated over the occurrences that arrived at RTECinc at qi (set I+). The time-points in set I+ are examined against all the intervals of B=VB (set IQi) overlapping the interval (qi ω, qi 1]. If an interval of B=VB includes a time-point in set I+, then this time-point should not coincide with any occurrence of event C, and should not overlap any of the intervals of D=VD. If all of these conditions are satisﬁed, then the rule ﬁres and gives rise to a new initiation point. Figure 4(a) shows the calculation of an initiation point belonging to part (qi ω, qi 1] by means of rule (6)(a). Rule (6)(b) is similar to (6)(a), but has a small modiﬁcation which ensures that derivations are not repeated. In this rule, only the intervals computed at qi are considered for B=VB (set I+). However, event A is matched against the occurrences at qi, excluding the occurrences that were inserted to the system at qi (set IQi \ I+). This means that we examine only time-points falling in (qi ω, qi 1] and present to the system from the previous query time qi 1, excluding the ones, if any, that were retracted at qi. This is important in order to avoid repeating evaluations. If in rule (6)(b) we used all the time-points of event A at qi, then we would have to repeat evaluations, since the inserted time-points of event A at qi (set I+) are included in the occurrences of the event at qi. The negative body literals are evaluated as in rule (6)(a). Figure 4(b) shows the calculation of an initiation point belonging to (qi ω, qi 1] by using rule (6)(b). Rule (6)(c) examines if a retracted occurrence of event C (set I ) can lead to a new initiation point. Recall from rule (1) that we demand the absence of event C in order for the rule to be satisﬁed. Thus, if at qi there are deleted occurrences of event C, this means

Tsilionis, Artikis, & Paliouras

qi qi-1 qi - ω

Rule (6)(a)

[happens At(A, T)] I ,

not [holds At(D=VD,T)] I .

[holds At(B=VB, T)] I ,

not [happens At(C, T)] I ,

Rule (6)(b)

[happens At(A, T)] I \ I ,

[holds At(B=VB, T)] I ,

not [happens At(C, T)] I ,

not [holds At(D=VD, T)] I .

qi qi-1 qi - ω

Rule (6)(c)

qi qi-1 qi - ω

[happens At(C, T)] I ,

[happens At(A, T)] I \ I ,

not [holds At(D=VD, T)] I .

Qi [holds At(B=VB, T)] I \ I ,

Rule (6)(d)

not [happens At(C, T)] I U I .

[holds At(D=VD, T)] I ,

[happens At(A, T)] I \ I ,

Qi [holds At(B=VB, T)] I \ I ,

qi qi-1 qi - ω

Figure 4: Illustration of the addition phase. Dots represent event occurrences, unlabeled horizontal lines represent ﬂuents intervals and arrows facing upwards represent initiation points. The black color signiﬁes event occurrences and ﬂuent intervals present both at qi 1 and qi, the green color signiﬁes events arriving at, and ﬂuent intervals computed at qi, while the red color signiﬁes event occurrences and ﬂuent intervals retracted at qi. Enlarged dots and lines denote participation in the addition phase. The vertical dashed lines indicate the time-points and intervals that give rise to a new initiation point. Each of the four illustrations corresponds to a delta rule of rule-set (6). The non-overlapping part of the two query times, (qi 1, qi], is greyed out in all illustrations since we focus on the overlapping part, i.e. (qi ω, qi 1].

Incremental Event Calculus for Run-Time Reasoning

that event C did not actually occur at the previously reported time-points, and thus an initiation point of F=V should have been calculated. Notice that the conditions of the rule have been re-ordered for performance reasons. Now, event C is evaluated ﬁrst. This favors computation since the time-points in set I are usually few. Figure 4(c) shows the calculation of an initiation point belonging to (qi ω, qi 1] by using rule (6)(c). Rule (6)(d) examines the deleted intervals of D=VD. At qi 1 an interval of D=VD may have prevented the calculation of an initiation point. If at qi this interval of D=VD is deleted, the rule will produce a new initiation point. We employ the same optimisations as in the previous two rules, but we also introduce a new one concerning negative literals. In rule (6)(c) we examined event C over the time-points in I . If any of these points led to new initiation points, then these derivations should not be repeated in rule (6)(d). In order to achieve this, we evaluate event C negatively over all its occurrences at qi including the deleted ones (set IQi I ). In other words, we take into account the occurrences in I , which are not present at qi in order to avoid repeating derivations. Figure 4(d) shows the calculation of an initiation point belonging to (qi ω, qi 1] by means of rule (6)(d). In each of the four delta rules, a body literal is evaluated over the set I+ or I . In practice these sets are small, compared to the set of all event occurrences and ﬂuent intervals, i.e. the set IQi. By using these small sets, the evaluation is faster and the performance is improved compared to the recomputation from scratch of RTEC. The introduced optimisations also ensure that a new initiation point can be produced by only one of the four delta rules.

3.3 Complexity Analysis

We present a worst-case complexity analysis of RTECinc concerning simple ﬂuents. We focus on the cost of computing maximal simple ﬂuent intervals in the overlap of two consecutive query times, i.e. (qi ω, qi 1]. We omit the non-overlapping part, (qi 1, qi], since the cost is the same in RTEC and RTECinc. Moreover, we make assumptions that lead to the worst overall cost of the incremental procedure as opposed to the worst case of one of its phases. We make this choice because the assumptions leading to the worst case of one phase, e.g. the addition phase, do not necessarily lead to the worst case of the other (e.g. deletion) phase. Table 4 summarizes the notation of the complexity analysis. We assume that there are e events and ﬂuents in the body of an initiated At/terminated At rule deﬁning a simple ﬂuent. Incremental reasoning makes sense only in the presence of delays and retractions falling in the overlap of two consecutive windows. Therefore, we assume that each of the e body literals of a rule will have n delayed insertions and retractions, i.e. n > 0.

3.3.1 Analysis of RTECinc

Deletion phase. In the deletion phase of the incremental procedure, the initiation and termination points se Qi 1 calculated at qi 1 and falling in (qi ω, qi 1] are examined in order to determine if they hold at qi. This is achieved by the use of delta rules of type (5) presented on page 977. Assuming that there are l rules of type (1) (page 969) deﬁning a ﬂuent, then we will have l delta rules of type (5). In these delta rules each of the initiation/termination points calculated at qi 1 is checked against the delayed insertions of positive body events and ﬂuents, as well as against the retractions of negative events and ﬂuents. Evaluating

Tsilionis, Artikis, & Paliouras

Table 4: Complexity analysis notation.

Notation Meaning m OV Number of time-points in (qi ω, qi 1]

e Number of event and ﬂuent types

n Number of event occurrences or ﬂuent intervals inserted and retracted at (qi 1, qi] and overlapping (qi ω, qi 1]

l Number of simple ﬂuent rules

a happens At predicate expressing an event in the body of a delta rule of type (5) requires retrieving the event s inserted or retracted time-points from the computer memory, and checking whether the initiation/termination point under investigation coincides with one of these inserted/retracted time-points. Similarly, evaluating a holds At predicate requires retrieving from the computer memory the inserted or retracted intervals of the ﬂuent, and checking whether the initiation/termination point belongs to these intervals. In the worst case, all body literals of the delta rules of type (5) will be evaluated, and thus, the cost of the deletion phase is bound by:

O l |se Qi 1| e n , (7)

where l is the number of the delta rules, |se Qi 1| represents the number of initiation/ termination points calculated at qi 1 and falling in (qi ω, qi 1], e is the number of events and ﬂuents in the body of a delta rule, and n the number of insertions/deletions of an event or ﬂuent within (qi ω, qi 1], as recorded in (qi 1, qi] (see Table 4). Notice that in practice l and e are small, while |se Qi 1| and n are only a small subset of a sliding window. Addition phase. The purpose of the addition phase is to compute the initiation/ termination points that are the result of delayed insertions and retractions, by using delta rules of type (6). Each initiated At/terminated At rule contains e events and ﬂuents. Therefore, each of the l rules of type (1) is transformed to e delta rules of type (6), leading to l e delta rules in total. In each of these rules, only one event or ﬂuent is evaluated over its delayed insertions or retractions. Next, the insertions/retractions of this event or ﬂuent are checked against all the occurrences and intervals of the remaining body events and ﬂuents in order to produce a new initiation/termination point. In the worst case, each insertion/retraction of an event and each inserted/retracted interval of a ﬂuent will contribute a new initiation/termination point, and thus, the cost of computing the initiation/termination points by using all l e delta rules is bound by:

O l e2 n |IQi| , (8)

where |IQi| is the number of event occurrences or ﬂuent intervals in (qi ω, qi 1]. The addition phase is more expensive than the deletion phase. During the addition phase, we have to evaluate more delta rules than the deletion phase. Furthermore, in the evaluation of a delta rule in the addition phase we examine the delayed insertions or

Incremental Event Calculus for Run-Time Reasoning

retractions of the ﬁrst body literal against all the occurrences or intervals of the remaining body literals; in contrast, in the deletion phase the body literals are evaluated only over their delayed insertions and retractions. The number of delayed insertions and retractions is usually less than the number of all event occurrences and ﬂuent intervals. Calculation of inserted and retracted ﬂuent intervals. The ﬁnal steps of the incremental procedure consist of constructing intervals from starting and ending points, and calculating the inserted and retracted intervals of ﬂuents (see Algorithm 1 on page 976). The cost of the former step is the same as that of RTEC. For the latter step, the cost of the symmetric diﬀerence (see line 5 of Algorithm 1) of the intervals computed at qi 1, IQi 1, and the intervals computed at qi, IQi, is limited by the sum of the sizes of the two lists, given that each list of maximal intervals is sorted. Since the intervals calculated at qi 1 are at most |se Qi 1|/2 and the intervals calculated at qi are at most |se Qi|/2, the cost of the

symmetric diﬀerence is bound by O(|se Qi|+|se Qi 1|

3.3.2 Comparison of RTEC and RTECinc

We examine the conditions in which RTECinc is preferable over RTEC. The worst case scenario for RTEC does not coincide with the worst case scenario for RTECinc and vice versa. Hence, we perform a comparison for the worst case of each CER engine.

Worst case for RTEC. In evaluation from scratch, the worst case is that every timepoint of the overlap, (qi ω, qi 1], is promoted to an initiation/termination point by all

initiated At/terminated At rules. Therefore, the cost of RTEC is bound by:

O l m2 OV e , (9)

where m OV represents the number of time-points in the overlap (qi ω, qi 1]. In the setting of the worst case for RTEC, the ﬁrst body literal of an initiated At/terminated At rule has m OV occurrences. Furthermore, these occurrences must coincide with the m OV occurrences of each of the remaining positive body events and belong to the m OV/2 intervals of the remaining positive body ﬂuents. Additionally, there are no occurrences/intervals of negative events and ﬂuents or all their occurrences/intervals are retracted at qi. Based on the above, and performing the appropriate substitutions and simpliﬁcations in the total cost of RTECinc and in formula (9) of RTEC, we derive inequality (10), that expresses the conditions in which RTECinc is preferable to RTEC:

m OV < 1 (10)

In other words, the most important factor for improving performance eﬀectively is the ratio of delayed insertions/retractions to the degree of overlap. The worst case of RTEC is extreme. However, the analysis of this case allows us to unravel the factors that most inﬂuence the performance of incremental evaluation. Intuitively, when there are few delays, RTECinc is the best option.

Worst case for RTECinc. In incremental evaluation, the worst case is when every timepoint in the overlap leads to an initiation/termination point at qi 1, and at qi all these

Tsilionis, Artikis, & Paliouras

points are retracted. In this case, RTEC is the preferred choice, since the cost of RTEC is close to zero, as there are no time-points that can be promoted to initiation/termination points. Deleting all initiation/termination points calculated at qi 1 is an inevitable operation for RTECinc, baring the cost of Eq. (7). In such extreme cases, where n is equal to (or approximates) m OV, RTEC is the preferred option.

A typical case. Based on the experience of the authors in a range of CER applications, including the two domains used in this paper (see Section 5), the following conditions hold:

The number of initiated At/terminated At rules (l) is in the same order of magnitude as the number of ﬂuent and event types (e).

|se Qi 1| |se Qi| m OV. To aid the presentation, we will represent the average number of starting and ending points of all windows as |se|.

The number of delays/retractions n is also a small fraction of the the window overlap, i.e. n = m OV

c , where c is a constant.

By performing the appropriate substitutions in the total cost of RTECinc and in formula (9) of RTEC, under these assumptions and minor simpliﬁcations, we conclude with the following inequality:

e > m OV . (11)

We observe again that the number of delays and retractions is the most crucial factor in order for RTECinc to improve performance, where few delays favor RTECinc.

4. Incremental Evaluation of Statically Determined Fluents

In this section, we present the incremental evaluation of statically determined ﬂuents. Recall e.g. rule (3) in Section 2.1.2 (page 971) that expresses a statically determined ﬂuent deﬁnition. To simplify the presentation, when we refer to the intervals of a ﬂuent we omit the value of the ﬂuent, e.g. the intervals of F=V are represented as IF . The evaluation of rule (3) will produce the intervals of F=V , IF , by combining the intervals IA and IB according to the given interval manipulation construct (see Table 1 on page 969). We will use rule (3) to illustrate the presentation that follows, as well as the example intervals of A=VA and B=VB that are displayed in Table 6. We will describe the incremental process for each of the three manipulation constructs, focusing on the overlapping part of the two query times, (qi ω, qi 1], and assuming without loss of generality that the deﬁnition of a statically determined ﬂuent depends only on two ﬂuents (as in rule (3)). This is because a holds For rule with k body ﬂuents may be rewritten as a set of holds For rules with two ﬂuents, each connected by means of auxiliary ﬂuents. Table 5 presents the notation concerning the analysis of statically determined ﬂuents. Moreover, in the sections that follow, we will use equation (12), stating that the intervals IQi F of F=V at qi, overlapping (qi ω, qi 1], are equal to the intervals produced by ﬁrst calculating the relative complement of IQi 1 F with respect to I F , i.e. by calculating the intervals that are not retracted at qi, and then computing the union of these intervals with the intervals I+ F that are inserted at qi.

Incremental Event Calculus for Run-Time Reasoning

Table 5: Notation for interval manipulation and comparison.

Notation Description

IA T IB union all operation of RTEC IA T IB intersect all operation of RTEC IA \T IB relative complement all operation of RTEC IA T IB Temporal subset; it denotes that the time-points of the intervals of IA are included in the intervals of IB.

Table 6: Running example for the evaluation of rule (3)

IQi 1 [(10, 15), (23, 30), (40, 50), (60, 70)] [(17, 21), (26, 35), (43, 47), (54, 65)]

I [(10, 12), (27, 30), (43, 47)] [(19, 21), (43, 47), (60, 65)]

I+ [(54, 58), (80, 90), (95, 100)] [(37, 41), (82, 87), (105, 120)]

IQi 1 A T B [(10, 15), (17, 21), (23, 35), (40, 50), (54, 70)]

IQi 1 A T B [(26, 30), (43, 47), (60, 65)]

IQi 1 A\T B [(10, 15), (23, 26), (40, 43), (47, 50), (65, 70)]

IQi 1 B\T A [(17, 21), (30, 35), (54, 60)]

IQi F =(IQi 1 F \T I F ) T I+ F (12)

4.1 Interval Union

First, we deal with the case of union all and assume that at the previous query time, qi 1, the list of intervals of the body ﬂuents of rule (3) is not empty, that is, IQi 1 A , IQi 1 B = . If this is not the case, then incremental evaluation will not improve performance. Furthermore, we choose to analyze the case where each body ﬂuent has retractions and insertions, that is, I A, I B, I+ A, I+ B = , as this is the most expensive case in terms of processing time.

If we compute everything from scratch, as RTEC does, the evaluation of rule (3) leads to the following computation: IQi F = IQi A T IQi B (13)

The size of IQi A and IQi B is, in the worst case, |IQi A | = |IQi 1 A | + |I A| + |I+ A| and |IQi B | = |IQi 1 B | + |I B| + |I+ B| respectively. To clarify, the worst case is for a deletion to break an existing interval to two parts and for an insertion to introduce a new interval, rather than extending an existing one. Additionally, in the worst case the lists of maximal intervals of ﬂuents A and B are non-overlapping, and since the two lists are temporally

Tsilionis, Artikis, & Paliouras

sorted, the cost of Eq. (13) is limited by the sum of the sizes of the two lists:

O |IQi 1 A | + |I A| + |I+ A| + |IQi 1 B | + |I B| + |I+ B| (14)

4.1.2 RTECinc

In RTECinc, we want to produce the intervals of union all at qi by using the intervals of

union all calculated at qi 1 along with the deletions and insertions of the two body ﬂuents at qi. The incremental evaluation of union all is the following:

(IQi 1 A T B) \T 5

(IQi 1 A T B) i T 4

The interested reader may refer to the Appendix for the derivation of the incremental formula from the non-incremental one of RTEC. IQi 1 A T B represents the result of rule (3) at

qi 1, and the term IQi 1 A T B represents the common time-points of the two body ﬂuents at qi 1, i.e. their intersection. Both these terms are known at qi. The circled numbers express just one of the many possible orders of execution and serve as pointers when we want to refer to a speciﬁc operation. Moreover, some of these operations may be performed in parallel (e.g. operations 2 , 3 and 6 ). First the intervals that must be removed from IQi 1 A T B are considered (see operations 1 5 in Eq. (15)). We distinguish between two cases:

(a) The time-points belonging to the retracted intervals of both body ﬂuents, A and B (see operation 3 in Eq. (15)), must be removed from the intervals of F that were calculated at the previous query time qi 1 (operation 5 in Eq. (15)).

(b) The time-points belonging to the retracted intervals of one body ﬂuent, and not belonging to the intervals of the other body ﬂuent that were computed at qi 1(operations 1 and 2 in Eq. (15)), must be deleted from the intervals of F (operation 5 in Eq. (15)).

Subsequently, the union of the insertions of the two body ﬂuents (operation 6 in Eq. (15)) is added to the outcome of operation 5 in Eq. (15) (operation 7 in Eq. (15)). In Table 7 we illustrate the evaluation of Eq. (15) with the aid of the example of Table 6, where we provide the output of each operation. We proceed with the complexity analysis of Eq. (15). Without loss of generality, assume that |IQi 1 A | |IQi 1 B |. We assume the worst case scenario for all the operations, including those involving retractions and insertions. For the operation I A T I B the worst case is for the intervals to be disjoint and thus for the output list to be of size |I A|+|I B|. This implies that the output of operation I A T I B will be . We selected this case as it leads both to the largest cost and output for operation 5 . In Table 8 we present the cost of each operation of Eq. (15) as well as the size of the output list. The latter facilitates understanding the steps of the complexity analysis, since the output list of an operation participates in a subsequent operation. Notice that in the complexity analysis presented in Table 8 the cost of operation 4 is zero since the output of operation 3 is empty. Furthermore, the output of the ﬁnal

Incremental Event Calculus for Run-Time Reasoning

operation ( 7 ) is not of interest since it is not used in any other operation. The overall cost of Eq. (15) is produced by summing the cost of all operations and is the following:

2|IQi 1 A T B| + 3|IQi 1 A T B| + 5(|I A| + |I B|) + 2(|I+ A| + |I+ B|)

Table 7: Evaluation of Eq. (15) using the running example of Table 6

Operation Result

1 [(10, 12), (19, 21), (27, 30), (43, 47), (60, 65)]

2 [(10, 12), (19, 21)]

3 [(43, 47)]

4 [(10, 12), (19, 21), (43, 47)]

5 [(12, 15), (17, 19), (23, 35), (40, 43), (47, 50), (54, 70)]

6 [(37, 41), (54, 58), (80, 90), (95, 100), (105, 120)]

7 [(12, 15), (17, 19), (23, 35), (37, 43), (47, 50), (54, 70), (80, 90), (95, 100), (105, 120)]

Table 8: Complexity analysis of Eq. (15)

Operation Cost Output

1 |I A| + |I B| |I A| + |I B|

2 |IQi 1 A T B| + |I A| + |I B| |IQi 1 A T B| + |I A| + |I B|

3 |I A| + |I B|

4 0 |IQi 1 A T B| + |I A| + |I B|

5 |IQi 1 A T B| + |IQi 1 A T B| + |I A| + |I B| |IQi 1 A T B| + |IQi 1 A T B| + |I A| + |I B|

6 |I+ A| + |I+ B| |I+ A| + |I+ B|

7 |IQi 1 A T B| + |IQi 1 A T B|+ -

|I A| + |I B| + |I+ A| + |I+ B|

4.1.3 Comparison of RTEC and RTECinc.

Our goal is to identify the conditions under which the incremental evaluation of statically determined ﬂuents using union all is more eﬃcient than the computation from scratch (RTEC).

Tsilionis, Artikis, & Paliouras

Table 9: RTECinc vs RTEC: union all. In all lines below, |I A| = |I B| = 0.

|I+ A| |I+ B| Conditions in which RTECinc outperforms RTEC

= 0 = 0 |IQi 1 A T B| < 2|IQi 1 AB | 2|I + AB |

0 = 0 |IQi 1 A T B| < 2|IQi 1 AB |

= 0 0 |IQi 1 A T B| < 2|IQi 1 AB |

To achieve this, we compare the cost of RTEC, as given by Eq. (14), and that of RTECinc see Eq. (16). To facilitate the presentation:

We will use |IQi 1 AB |, as opposed to |IQi 1 A | and |IQi 1 B |, to represent the number of intervals of the two body ﬂuents at the previous query time.

We will sometimes use |I + AB |, as opposed to |I A|, |I B|, |I+ A| and |I+ B|, to represent the number of deletions/insertions of the two body ﬂuents.

RTECinc is preferable to RTEC when:

2|IQi 1 A T B| + 3|IQi 1 A T B| + 14|I + AB |

2|IQi 1 AB | + 4|I + AB |

|IQi 1 A T B| < |IQi 1 AB | 3

2|IQi 1 A T B| 5|I + AB |

Unfortunately, this inequality does not hold in the presence of deletions, even if we assume the best case for the individual operations of Eq. (15). In the case of the absence of deletions, i.e. when I A = I B = , Eq. (15), expressing the incremental evaluation of union all, is simpliﬁed to the considerably cheaper following operations:

IQi 1 A T B T h I+ A T I+ B i (18)

Table 9 displays the conditions in which RTECinc is preferable to RTEC. Figure 5 illustrates, with the use of an example, the case displayed in the ﬁrst line of Table 9. RTECinc is preferred over RTEC as the number of intervals of the two body ﬂuents increases and the number of delayed insertions decreases. The cases in which RTEC is preferable can be found in the complete comparison Table presented in the Appendix.

4.2 Interval Intersection

In this section, we present the incremental evaluation of the interval manipulation construct

intersect all and the beneﬁts in performance.

Incremental Event Calculus for Run-Time Reasoning

Figure 5: An example where RTECinc outperforms RTEC. The upper part shows the result of union all at qi 1, while the bottom part displays the result of union all at qi. ω expresses the sliding window. Unlabelled horizontal lines represent ﬂuent intervals. Green lines express ﬂuent intervals inserted in (qi 1, qi]. Vertical dashed lines denote the common intervals of IA T B at both query times. The non-overlapping part of the two query times, (qi 1, qi], in the bottom part of the ﬁgure is greyed out.

If we compute everything from scratch, the cost of evaluating intersect all is bound by Eq. (14) again. In the worst case, the cost of intersect all is limited by the sum of the sizes of the two lists of intervals.

4.2.2 RTECinc

We express the intervals of intersect all at qi by taking into consideration the result of intersection at qi 1, as well as the intervals of the two body ﬂuents of rule (3) (page 971) that were not part of intersection at qi 1. These body ﬂuent intervals, represented as IQi 1 A\T B and

IQi 1 B\T A, may contribute to the evaluation of intersect all at qi in the case of delayed insertions. The incremental evaluation of intersect all is as follows:

IQi F = IQi 1 A T B \T 2

I B) i T 10 " I+ B T 5

h (IQi 1 A\T B \T 3

h (IQi 1 B\T A \T 6

I+ B i # (19)

The derivation of the incremental formula of RTECinc from the non-incremental one of RTEC is presented in the Appendix. Terms IQi 1 A\T B, IQi 1 B\T A and IQi 1 A T B are known at qi. The circled numbers express a sequence of operations leading to the evaluation of Eq. (19).

Tsilionis, Artikis, & Paliouras

Table 10: Evaluation of Eq. (19) with the use of the running example of Table 6

Operation Result

1 [(10, 12), (19, 21), (27, 30), (43, 47), (60, 65)]

2 [(26, 27)]

3 [(12, 15), (23, 26), (40, 43), (47, 50), (65, 70)]

4 [(12, 15), (23, 26), (40, 43), (47, 50), (54, 58), (65, 70), (80, 90), (95, 100)]

5 [(40, 41), (82, 87)]

6 [(17, 19), (30, 35), (54, 60)]

7 [(17, 19), (30, 35), (37, 41), (54, 60), (82, 87), (105, 120)]

8 [(54, 58), (82, 87)]

9 [(40, 41), (54, 58), (82, 87)]

10 [(26, 27), (40, 41), (54, 58), (82, 87)]

In Eq. (19), we ﬁrst delete from IQi 1 A T B the retracted intervals of A=VA and B=VB (see operations 1 2 in Eq. (19)). Next, we compute the inserted intervals of B=VB that belong to IQi 1 A\T B, after excluding the deletions of A=VA, or to the insertions of A=VA (operations 3 5 ). We do the same for the insertions of A=VA (operations 6 8 ). Finally, the union of these operations provides the intersection of the intervals of the two ﬂuents at qi (operations 9 10 ). Table 10 illustrates the incremental evaluation of intersect all with the use of the running example displayed in Table 6 on page 985. For the complexity analysis, we consider, as usual, the highest overall cost and the largest possible output. The conditions that lead to the highest cost are the following:

1. IQi 1 A T B, IQi 1 A\T B, IQi 1 B\T A = .

2. I A T I B = .

3. IQi 1 A T B remains unaﬀected after operation 2 of Eq. (19).

4. I A splits all the intervals of IQi 1 A\T B in operation 3 and I B splits the intervals of IQi 1 B\T A in operation 6 .

5. I+ B overlaps completely IQi 1 A\T B \T I A and partially I+ A. Accordingly I+ A overlaps com-

pletely IQi 1 B\T A \T I B and partially I+ B.

A detailed analysis of the cost of each operation of Eq. (19) is presented in Table 11, in a

similar way as in Section 4.1. The output of operation 9 contains the term |I+ A|+|I+ B| 2 , since operations 5 and 8 produce some common intervals, which must be considered only once.

Incremental Event Calculus for Run-Time Reasoning

Table 11: Complexity analysis of Eq. (19)

Operation Cost Output

1 |I A| + |I B| |I A| + |I B|

2 |IQi 1 A T B| + |I A| + |I B| |IQi 1 A T B|

3 |IQi 1 A\T B| + |I A| |IQi 1 A\T B| + |I A|

4 |IQi 1 A\T B| + |I A| + |I+ A| |IQi 1 A\T B| + |I A| + |I+ A|

5 |IQi 1 A\T B| + |I A| + |I+ A| + |I+ B| |IQi 1 A\T B| + |I A| + |I+ A|

6 |IQi 1 B\T A| + |I B| |IQi 1 B\T A| + |I B|

7 |IQi 1 B\T A| + |I B| + |I+ B| |IQi 1 B\T A| + |I B| + |I+ B|

8 |IQi 1 B\T A| + |I B| + |I+ B| + |I+ A| |IQi 1 B\T A| + |I B| + |I+ B|

9 |IQi 1 A\T B| + |I A| + |I+ A|+ |IQi 1 A\T B| + |I A| + |IQi 1 B\T A|+

|IQi 1 B\T A| + |I B| + |I+ B| |I B| + |I+ A|+|I+ B| 2

10 |IQi 1 A T B| + |IQi 1 A\T B| + |I A|+ -

|IQi 1 B\T A| + |I B| + |I+ A|+|I+ B| 2

Those intervals are the result of the intersection of I+ A and I+ B. By summing the costs of all operations, we compute the bound of incremental intersect all:

2|IQi 1 A T B| + 5(|IQi 1 A\T B| + |IQi 1 B\T A|) + 9

2(|I+ A| + |I+ B|) + 7(|I A| + |I B|)

4.2.3 Comparison of RTEC and RTECinc.

We compare the cost of RTEC as given by Eq. (14), and that of RTECinc see Eq. (20). To facilitate the presentation, we simplify notation as follows:

We will use |IQi 1 AB |, as opposed to |IQi 1 A | and |IQi 1 B |, to represent the number of intervals of the two body ﬂuents at the previous query time.

We will use |IQi 1 A\T B| to denote |IQi 1 A\T B| and |IQi 1 B\T A|.

We will sometimes use |I + AB |, as opposed to |I A|, |I B|, |I+ A| and |I+ B|, to represent the number of deletions/insertions of the two body ﬂuents.

Tsilionis, Artikis, & Paliouras

Table 12: RTECinc vs RTEC: intersect all.

|I A| |I+ A| |I B| |I+ B| Conditions in which RTECinc outperforms RTEC

= 0 0 = 0 = 0 |IQi 1 A T B| < |IQi 1 AB | 3

2|IQi 1 A\T B| 2|I + AB |

0 0 = 0 = 0 |IQi 1 A T B| < |IQi 1 AB | |IQi 1 A\T B|

0 = 0 0 = 0 |IQi 1 A T B| < 2|IQi 1 AB | 8|IQi 1 A\T B| 7

= 0 0 0 = 0 |IQi 1 A T B| < |IQi 1 AB | 3

2|IQi 1 A\T B| 3

0 0 0 = 0 |IQi 1 A T B| < 2|IQi 1 AB | 2|IQi 1 A\T B|

= 0 = 0 = 0 0 |IQi 1 A T B| < |IQi 1 AB | 3

2|IQi 1 A\T B| 2|I + AB |

0 = 0 = 0 0 |IQi 1 A T B| < |IQi 1 AB | 3

2|IQi 1 A\T B| 3

= 0 0 = 0 0 |IQi 1 A T B| < 2|IQi 1 AB | 2|I + AB |

0 0 = 0 0 |IQi 1 A T B| < 2|IQi 1 AB |

= 0 = 0 0 0 |IQi 1 A T B| < |IQi 1 AB | |IQi 1 A\T B|

0 = 0 0 0 |IQi 1 A T B| < 2|IQi 1 AB | 2|IQi 1 A\T B|

= 0 0 0 0 |IQi 1 A T B| < 2|IQi 1 AB |

Then, RTECinc is preferable to RTEC when:

O 2|IQi 1 AB | + 4|I + AB | > O 2|IQi 1 A T B| + 10|IQi 1 A\T B| + 23|I + AB |

|I Qi 1 A T B| < |I Qi 1 AB | 5|I Qi 1 A\T B| 19

2 |I + AB |

Unfortunately, the above inequality cannot be satisﬁed. In other words, when the conditions 1 5 stated above hold, RTEC is always the best option. Table 12 presents those cases in which RTECinc outperforms RTEC. In Figure 6, we display an example that illustrates one of the cases. The complete comparison can be found in the Appendix. In the absence of deletions, as in the example of Figure 6, Eq. (19) may be simpliﬁed to:

IQi 1 A T B T h I+ A T IQi B i T h I+ B T IQi A i (22)

This is a much cheaper computation of intersect all as opposed to Eq (19).

Incremental Event Calculus for Run-Time Reasoning

Figure 6: An example where RTECinc outperforms RTEC. This corresponds to the third line of Table 12. ω expresses the sliding window. The upper part of the ﬁgure shows the result of intersect all at qi 1, while the bottom part displays the result of intersect all at qi. Unlabelled horizontal lines represent ﬂuent intervals. Green lines express ﬂuent intervals inserted in (qi 1, qi]. Vertical dashed lines denote the common intervals of the two body ﬂuents, IA T B, at both query times.

4.3 Relative Complement

The last interval manipulation construct of RTEC is relative complement all. We examine the case of the relative complement of A=VA with respect to B=VB.

Since we refer to two ﬂuents, the computation from scratch of RTEC is bound once again by Eq. (14) (Artikis et al., 2015).

4.3.2 RTECinc

In the incremental case, we express the new intervals produced by relative complement all by taking into consideration not only the result of relative complement all at qi 1 but also the intervals at qi 1 that hold for ﬂuent B and not for A, as well as their common intervals. We represent these two sets of intervals as IQi 1 B\T A and IQi 1 A T B, respectively. The incremental formula of

relative complement all is the following:

Tsilionis, Artikis, & Paliouras

IQi F = IQi 1 A\T B \T 2

I+ B) T 10 " I+ A \T 5

h (IQi 1 B\T A \T 3

h (IQi 1 A T B \T 6

I+ A i # (23)

The derivation of Eq. (23) from the non-incremental one of RTEC may be found in the Appendix. First, the intervals that have been inserted for B=VB or deleted from A=VA and belong to IQi 1 A\T B, must be retracted (see operations 1 2 in Eq. (23)). Next, we calculate

which of the inserted intervals of A=VA are not in IQi 1 B\T A, after removing the deletions of B=VB, and/or in the insertions of B=VB (see operations 3 5 ). Subsequently, we include in the result the retracted intervals of B=VB that are among the inserted intervals of A=VA or belong to IQi 1 A T B, after discarding the deletions of A=VA (see operations 6 8 ). Finally, the union of the results of these operations constitutes the result of relative complement of A=VA with respect to B=VB at qi (operations 9 10 ). Table 13 illustrates the evaluation of Eq. (23) with the use of our running example (see Table 6 on page 985). The worst case, in terms of complexity, concerning the evaluation of incremental

relative complement all, arises when the following conditions hold:

1. IQi 1 A T B, IQi 1 A\T B, IQi 1 B\T A = .

2. I A T I+ B = .

3. IQi 1 A\T B remains unaﬀected after operation 2 of Eq. (23).

4. IQi 1 B\T A remains unaﬀected after operation 3 of Eq. (23).

5. I A splits all the intervals of IQi 1 A T B (operation 6 of Eq. (23)).

6. IQi 1 B\T A, I+ B T I+ A.

7. I B T (IQi 1 A T B\T I A) and I B T I+ A = . This condition implies that IQi 1 A T B is produced by the intervals of A=VA and B=VB at qi 1 that are exactly the same.

Table 14 presents the steps of the complexity analysis. The cost of incremental rela-

tive complement all is bound by:

2|IQi 1 A\T B| + 5(|IQi 1 B\T A| + |IQi 1 A T B|) + 5|I+ A| + 6|I+ B| + 7|I A| + 2|I B|

4.3.3 Comparison of RTEC and RTECinc.

We compare the cost of RTEC as given by Eq. (14), and that of RTECinc. To facilitate the presentation we make the usual simpliﬁcations in notation:

Incremental Event Calculus for Run-Time Reasoning

Table 13: Evaluation of Eq. (23) with the use of the running example of Table 6

Operation Result

1 [(10, 12), (27, 30), (37, 41), (43, 47), (82, 87), (105, 120)]

2 [(12, 15), (23, 26), (41, 43), (47, 50), (65, 70)]

3 [(17, 19), (30, 35), (54, 60)]

4 [(17, 19), (30, 35), (37, 41), (54, 60), (82, 87), (105, 120)]

5 [(80, 82), (87, 90), (95, 100)]

6 [(26, 27), (60, 65)]

7 [(26, 27), (54, 58), (60, 65), (80, 90), (95, 100)]

8 [(60, 65)]

9 [(60, 65), (80, 82), (87, 90), (95, 100)]

10 [(12, 15), (23, 26), (41, 43), (47, 50), (60, 70), (80, 82), (87, 90), (95, 100)]

Table 14: Complexity analysis of Eq. (24)

Operation Cost Output

1 |I A| + |I+ B| |I A| + |I+ B|

2 |IQi 1 A\T B| + |I A| + |I+ B| |IQi 1 A\T B|

3 |IQi 1 B\T A| + |I B| |IQi 1 B\T A|

4 |IQi 1 B\T A| + |I+ B| |IQi 1 B\T A| + |I+ B|

5 |IQi 1 B\T A| + |I+ B| + |I+ A| |IQi 1 B\T A| + |I+ B| + |I+ A|

6 |IQi 1 A T B| + |I A| |IQi 1 A T B| + |I A|

7 |IQi 1 A T B| + |I A| + |I+ A| |IQi 1 A T B| + |I A| + |I+ A|

8 |IQi 1 A T B| + |I A| + |I+ A| + |I B| |IQi 1 A T B| + |I A|

9 |IQi 1 B\T A| + |I+ B| + |I+ A|+ |IQi 1 B\T A| + |IQi 1 A T B|+

|IQi 1 A T B| + |I A| |I A| + |I+ A| + |I+ B|

10 |IQi 1 A\T B| + |IQi 1 B\T A| + |IQi 1 A T B|+ - |I A| + |I+ A| + |I+ B|

Tsilionis, Artikis, & Paliouras

Table 15: RTECinc vs RTEC: relative complement all.

|I A| |I+ A| |I B| |I+ B| Conditions in which RTECinc outperforms RTEC

= 0 0 = 0 = 0 |IQi 1 A\T B| < |IQi 1 AB | 3

2|IQi 1 A T B| 5

0 0 = 0 = 0 |IQi 1 A\T B| < |IQi 1 AB | |IQi 1 A T B|

= 0 = 0 0 = 0 |IQi 1 A\T B| < |IQi 1 AB | 3

2|IQi 1 B\T A| 3|I + AB |

0 = 0 0 = 0 |IQi 1 A\T B| < |IQi 1 AB | 3

2|IQi 1 B\T A| 2|I + AB |

= 0 0 0 = 0 |IQi 1 A\T B| < 2|IQi 1 AB | 2|I + AB |

0 0 0 = 0 |IQi 1 A\T B| < 2|IQi 1 AB |

= 0 0 = 0 0 |IQi 1 A\T B| < |IQi 1 AB | 3

2|IQi 1 A T B| 3

0 0 = 0 0 |IQi 1 A\T B| < 2|IQi 1 AB | 2|IQi 1 A T B|

= 0 = 0 0 0 |IQi 1 A\T B| < |IQi 1 AB | |IQi 1 B\T A| |I + AB |

0 = 0 0 0 |IQi 1 A\T B| < 2|IQi 1 AB | 2|IQi 1 B\T B| |I + AB |

= 0 0 0 0 |IQi 1 A\T B| < 2|IQi 1 AB |

We will use |IQi 1 AB |, as opposed to |IQi 1 A | and |IQi 1 B |, to represent the number of intervals of the two body ﬂuents at the previous query time.

We will use |I + AB |, as opposed to |I A|, |I B|, |I+ A| and |I+ B|, to represent the number of deletions/insertions of the two body ﬂuents.

RTECinc is preferable to RTEC when:

O 2|IQi 1 AB | + 4|I + AB | > O 2|IQi 1 A\T B| + 5(|IQi 1 B\T A| + |IQi 1 A T B|) + 20|I + AB |

|IQi 1 A\T B|< |I Qi 1 AB | 5

2(|I Qi 1 B\T A| + |I Qi 1 A T B|) 8|I + AB |

The above inequality can be satisﬁed only if the deletion and insertion sets of the two body ﬂuents of rule (3) (page 971) are empty. In other words, in the wost case, RTEC is the preferred option over RTECinc. Table 15 presents the conditions in which RTECinc outperforms RTEC. The complete comparison can be found in the Appendix. An example that satisﬁes the inequality of the fourth line of Table 15 is presented in Figure 7.

Incremental Event Calculus for Run-Time Reasoning

Figure 7: An example where RTECinc outperforms RTEC with respect to interval manipulation construct relative complement all. The upper part of the ﬁgure shows the result of

relative complement all at qi 1, while the bottom part the result of relative complement all at qi. Unlabelled horizontal lines represent ﬂuent intervals. Green lines express ﬂuent intervals inserted in (qi 1, qi]. Vertical dashed lines denote the common intervals of IA\T B at both query times.

In the absence of deletions, we simplify the incremental evaluation of relative complement all to the much cheaper formula below: h IQi 1 A\T B \T I+ B i T h I+ A \T IQi B i (26)

5. Empirical Analysis

In this section, we present our empirical analysis on real and synthetic datasets from the ﬁelds of maritime monitoring and ﬂeet management.

5.1 Experimental Setup

RTEC and RTECinc are pure Prolog implementations the source code of both systems is publicly available3,1. RTEC already uses various optimisation techniques for eﬃcient stream reasoning, such as a rule compilation stage that rewrites the rules in order to take advantage of Prolog s built-in indexing mechanism. Details about RTEC may be found in Artikis et al. (2015) and the manual of RTEC4. RTECinc inherits all optimisation techniques

3. https://github.com/aartikis/RTEC 4. https://github.com/aartikis/RTEC/blob/master/RTEC_manual.pdf

Tsilionis, Artikis, & Paliouras

of RTEC. A diﬀerence in the implementation of RTECinc with respect to RTEC concerns the use of Prolog s internal database. The assertion and retraction of clauses is performed by using the internal database in RTECinc and the usual way, i.e. with the use of assertz and retract predicates, in RTEC. The datasets used in the empirical analysis are stored as CSV ﬁles, where each line corresponds to an input event (indicative datasets are available with the code of RTECinc1). We simulated a streaming behavior by consuming the input events little by little, i.e. reading small chunks periodically from the CSV ﬁles according to slide step speciﬁcations. The experiments were designed with the goal of highlighting the gain in performance RTECinc can bring to Composite Event Recognition (CER) in comparison to RTEC. To demonstrate the eﬀect of the overlap between consecutive query times, we employ ﬁve temporal windows with a constant step in all experiments. The experiments were performed on a computer with 24 cores (Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz) and 252 GB of RAM, running Debian GNU/Linux 9 with Kernel 4.9.0-7-amd64 and YAP Prolog 6.2.2.

5.2 Maritime Situational Awareness

Maritime situational awareness concerns the recognition of compostite maritime events, such as ship-to-ship transfer, ﬁshing and dangerous sailing, and is typically achieved by monitoring the messages vessels emit while sailing at sea. These messages are exchanged through the Automatic Identiﬁcation System (AIS5) and contain information about the position, heading, speed, etc of vessels at diﬀerent points in time. The timestamp of an AIS message refers to the time of the measurement on board. By using temporally sorted AIS messages one may reconstruct the trajectory of a vessel. Moreover, it has been shown that the trajectory of a vessel may be compressed by retaining only a subset of the AIS messages, called critical points , which allow for an accurate trajectory reconstruction (Patroumpas et al., 2017). These critical points convey information about the start/end of sailing at a low/high speed, changes in speed/heading, notiﬁcations about communication gaps, turns, etc. Furthermore, the critical points can be spatially processed in order to discover various relations, such as the entrance or exit of a vessel in an area of interest, and the proximity of two vessels (Santipantakis et al., 2018). The critical points along with the spatial relations constitute the input (SDEs) to our system. The composite event description used in our empirical analysis includes sixteen simple ﬂuents and ﬁve statically determined ﬂuents, forming a hierarchy displayed in Figure 1 on page 973. The deﬁnitions of the statically determined ﬂuents contain three operations of

union all, three operations of intersect all, and two operations of relative complement all. Terrestrial and satellite stations collect and forward the AIS messages emitted by the vessels to the CER system. The stations have to deal with a large number of messages and due to this high traﬃc signiﬁcant delays may occur. The CER system must handle in an eﬃcient way the delayed arrival of events in order to minimize the recognition loss. Our empirical evaluation includes delayed SDEs, but not SDE retraction, as SDEs are not retracted in maritime monitoring. However, delayed ﬂuent interval retraction (and assertion) may take place in the higher levels of the ﬂuent hierarchy (see Figure 1 on page 973). Consider, for example, the simple ﬂuent gap in Figure 1. Assume that at the

5. http://www.imo.org/en/Our Work/Safety/Navigation/Pages/AIS.aspx

Incremental Event Calculus for Run-Time Reasoning

(a) Brest, France: Synthetic delays dataset

(b) Europe: Natural delays dataset

Figure 8: Position signals of the two datasets (Pitsikalis et al., 2019).

current query time qi new intervals are computed for gap. This may lead to the retraction of intervals for ﬂuents stopped and low Speed since gap participates negatively in the body of these ﬂuents. Furthermore, the retraction of intervals for the ﬂuents stopped and low Speed will lead to the retraction of intervals for the ﬂuent rendez Vous. We employed two datasets for our empirical analysis; the ﬁrst is a publicly available dataset6 concerning the area of Brest, France, and the second concerns all European seas, and was provided to us by IMIS Global7, our partner in the dat Acron project8. The geographical coverage of the datasets is displayed in Figure 8. Both datasets are temporally sorted. As a result, they did not include delays. For the dataset of Europe, we additionally obtained the AIS messages with timestamps that reveal the time the message left the terrestrial/satellite station. These timestamps diﬀer from the timestamps recorded on board. By using the timestamps of the terrestrial/satellite stations and sorting the dataset we are able to retain the natural delays. For the Brest dataset these timestamps were not available and the experimental evaluation was performed by introducing synthetic delays.

5.2.1 Synthetic delays

The ﬁrst dataset concerns approx. 5K vessels sailing in the Atlantic Ocean around the port of Brest, France, and spans a period from 1 October 2015 to 31 March 2016 (Ray et al., 2019). The dataset consists of approx. 5M critical and spatial SDEs. The delays in this dataset cannot be recovered and an approach of artiﬁcially injecting delays was adopted. We performed a series of 5 experiments, each time diﬀerentiating the amount of SDEs being delayed. We selected uniformly 5%, 10%, 20%, 40% and 80% of the total events to be delayed. We used a uniform distribution for selecting events, since we assume that each event has the same probability to be delayed. Regarding the magnitude of the delay, in the real world, delays of few hours are more probable to happen, even though delays of over 16 hours have been observed in the maritime domain (Camossi et al., 2017; Ray et al., 2016). In order to mimic reality, as much as possible, we used a Gamma distribution with shape parameter k = 2 and scale parameter θ = 2. Thus, a small delay had a higher probability

6. https://zenodo.org/record/1167595/#.X5a Qw C8Rr OQ 7. https://imisglobal.com 8. http://datacron-project.eu

Tsilionis, Artikis, & Paliouras

12h/17K 24h/35K 48h/70K 96h/141K 168h/249K 0

Window (Hours/SDEs)

Avg Time (in seconds)

RTEC inc RTEC

(a) Total recognition time (5%)

12h/17K 24h/35K 48h/70K 96h/141K 168h/249K 100

Window (Hours/SDEs)

RTEC inc % Improvement

Simple Statically Determined

(b) Fluent improvement (5%)

12h/17K 24h/35K 48h/70K 96h/141K 168h/249K 100

Window (Hours/SDEs)

RTEC inc % Improvement

union all intersect all relative complement all

(c) Interval manipulation construct improvement (5%)

12h/16K 24h/33K 48h/67K 96h/138K 168h/246K 0

Window (Hours/SDEs)

Avg Time (in seconds)

RTEC inc RTEC

(d) Total recognition time (10%)

12h/16K 24h/33K 48h/67K 96h/138K 168h/246K 100

Window (Hours/SDEs)

RTEC inc % Improvement

Simple Statically Determined

(e) Fluent improvement (10%)

12h/16K 24h/33K 48h/67K 96h/138K 168h/246K 100

Window (Hours/SDEs)

RTEC inc % Improvement

union all intersect all relative complement all

(f) Interval manipulation construct improvement (10%)

12h/14K 24h/30K 48h/62K 96h/131K 168h/238K 0

Window (Hours/SDEs)

Avg Time (in seconds)

RTEC inc RTEC

(g) Total recognition time (20%)

12h/14K 24h/30K 48h/62K 96h/131K 168h/238K 100

Window (Hours/SDEs)

RTEC inc % Improvement

Simple Statically Determined

(h) Fluent improvement (20%)

12h/14K 24h/30K 48h/62K 96h/131K 168h/238K 100

Window (Hours/SDEs)

RTEC inc % Improvement

union all intersect all relative complement all

(i) Interval manipulation construct improvement (20%)

12h/11K 24h/23K 48h/51K 96h/117K 168h/224K 0

Window (Hours/SDEs)

Avg Time (in seconds)

RTEC inc RTEC

(j) Total recognition time (40%)

12h/11K 24h/23K 48h/51K 96h/117K 168h/224K 100

Window (Hours/SDEs)

RTEC inc % Improvement

Simple Statically Determined

(k) Fluent improvement (40%)

12h/11K 24h/23K 48h/51K 96h/117K 168h/224K 100

Window (Hours/SDEs)

RTEC inc % Improvement

union all intersect all relative complement all

(l) Interval manipulation construct improvement (40%)

Incremental Event Calculus for Run-Time Reasoning

12h/4K 24h/10K 48h/29K 96h/89K 168h/194K 0

33% 45% 62%

Window (Hours/SDEs)

Avg Time (in seconds)

RTEC inc RTEC

(m) Total recognition time (80%)

12h/4K 24h/10K 48h/29K 96h/89K 168h/194K 100

Window (Hours/SDEs)

RTEC inc % Improvement

Simple Statically Determined

(n) Fluent improvement (80%)

12h/4K 24h/10K 48h/29K 96h/89K 168h/194K 100

Window (Hours/SDEs)

RTEC inc % Improvement

union all intersect all relative complement all

(o) Interval manipulation construct improvement (80%)

Figure 9: Results on the synthetic delays dataset (slide step of 12 hours). Each row of diagrams corresponds to a diﬀerent amount of delayed SDEs (5%, 10%, 20%, 40%, 80%).

to be imposed in a selected event. The average delay time in all settings was approximately 96 hours.

Even though this dataset spans a period of six months, it does not contain a large number of AIS messages for each vessel. We employed large temporal windows, that usually are not found in practice, in order to stress test the CER systems. Figure 9 presents the results where the sliding window varies from 12 to 168 hours while the slide step is constant and equal to 12 hours. Each row of diagrams of Figure 9 corresponds to a diﬀerent amount of delayed events while the diagram columns refer to (a) the total recognition time, (b) the improvement RTECinc brings in the processing time of each ﬂuent type and (c) the improvement RTECinc achieves in the processing time of each interval manipulation construct. The times displayed in the ﬁrst column of diagrams of Figure 9 correspond to the average recognition time. In the second and third columns of the diagrams of Figure 9, positive values indicate an improvement in performance achieved by RTECinc while negative values indicate a performance decrease. In all the diagrams of Figure 9 we state for each sliding window the average number of SDEs falling inside. It should be noted that for a diﬀerent percentage of delayed events, the average number of SDEs may be diﬀerent for the same window size. For example, when 5% of the SDEs are delayed, the 168-hours window includes 249K SDEs on average, while when 10% of the SDEs are delayed, the 168-hours window contains 246K SDEs. This is due to the fact that some of the delayed events arrive to the CER system too late to be taken into consideration.

Regarding recognition time, RTECinc achieves a performance gain for all the percentages of delayed events and window sizes. This performance gain is highlighted for each sliding window with the arrows in each diagram of the ﬁrst column of Figure 9. The numbers on top of the arrows denote the approximate improvement RTECinc brings to the overall recognition. The performance gain becomes more profound as the sliding window size increases. Furthermore, RTECinc has more predictable performance, as indicated by the standard deviations displayed in the left diagrams of Figure 9. RTECinc improves performance even in the case of the 12-hours window, where there is no overlap in consecutive windows. This

Tsilionis, Artikis, & Paliouras

is mainly due to the fact that RTECinc recognizes the cases where interval computation is unnecessary and avoids them.

The second column of Figure 9 shows the improvement of RTECinc as regards the computation of the intervals of simple and statically determined ﬂuents. In the case of statically determined ﬂuents, RTEC has a better performance than RTECinc, regardless of the amount of delayed SDEs and the window size. The third column of Figure 9 refers to the comparison of the three interval manipulation constructs. RTEC s evaluation of the interval manipulation constructs usually demands less amount of time. In the complexity analysis we showed the conditions under which the incremental version of the three interval manipulation constructs can outperform the non-incremental one. These conditions are usually hard to satisfy, mainly for intersect all and relative complement all. The most decisive factor for the performance of RTECinc is the number of intervals the body ﬂuents of a rule have from the previous query time. The higher the number of these intervals, the more probable it is for the incremental procedure to improve performance. In the Brest dataset, the SDEs referring to a vessel are infrequent, and as a consequence the produced intervals of a ﬂuent referring to a vessel may not be great in number.

Even in the cases where RTECinc improves the performance in all interval manipulation constructs, the overall performance concerning statically determined ﬂuents is worse than RTEC. This lies in the fact that RTECinc operates on binary rules to produce the intervals of a statically determined ﬂuent, leading to costly memory operations. RTEC avoids this issue as it is not restricted to binary rules for statically determined ﬂuents.

RTECinc manages to produce the intervals of simple ﬂuents more eﬃciently than RTEC. This is the main reason why RTECinc achieves overall better performance than RTEC. Our complexity analysis showed that the most important factor is the ratio of delayed events to the size of the overlap. The lower this ratio is the higher the gain of RTECinc. By inspecting the performance of RTECinc in the same window but for diﬀerent percentages of delayed events, one may verify that as the number of delayed events increases the improvement in processing time also decreases.

5.2.2 Natural delays

The second dataset concerns 34K vessels sailing in the European seas in January 2016 (see Figure 8(b) on page 999). The compression-annotation of the AIS messages and subsequently the spatial processing yielded in total 17M SDEs. As already mentioned, this dataset retains the natural delays and permits us to test the recognition systems in real life conditions. Additionally, it allows us to test the scalability of the systems given the large number of vessels and SDEs.

Figure 10 displays our experimental results. We used ﬁve diﬀerent sliding windows, varying from 1 to 16 hours, and a constant step of 1 hour to examine the eﬀect of the overlap in performance. The number of SDEs increases dramatically even for smaller windows compared to the Brest dataset. Both RTECinc and RTEC are capable of supporting realtime CER in this demanding dataset. RTECinc manages again to improve performance in all window sizes and as the overlap of consecutive query times increases, this improvement is more pronounced (Figure 10(a)). Again, the arrows demonstrate the improvement RTECinc

Incremental Event Calculus for Run-Time Reasoning

1h/39K 2h/79K 4h/158K 8h/317K 16h/635K 0

Window (Hours/SDEs)

Avg Time (in minutes)

RTEC inc RTEC

(a) Total recognition time

1h/39K 2h/79K 4h/158K 8h/317K 16h/635K 100

Window (Hours/SDEs)

RTEC inc % Improvement

Simple Statically Determined

(b) Fluent improvement

1h/39K 2h/79K 4h/158K 8h/317K 16h/635K 100

Window (Hours/SDEs)

RTEC inc % Improvement

union all intersect all relative complement all

(c) Imc improvement

Figure 10: Results with the natural delays dataset (slide step of 1 hour).

achieves in each sliding window. Additionally, in contrast to RTEC, RTECinc exhibits a near linear increase in processing time, as the size of the window increases.

In Figure 10(b) we present RTECinc s improvement in processing time for statically determined and simple ﬂuents, compared to RTEC. In contrast to the experiments performed with the synthetic delays dataset, RTECinc now manages to improve performance in all windows for statically determined ﬂuents. The great increase in SDEs allows the incremental procedure to achieve the observed performance gain. In Figure 10(c) the incremental operations achieve better performance in 4, 8 and 16-hours windows, illustrating that as the overlap increases the eﬀects of the incremental process on performance are more pronounced.

Regarding the processing of simple ﬂuents, RTECinc again signiﬁcantly enhances the performance of RTEC. As the overlap of consecutive query times increases, the performance gain of RTECinc over RTEC increases. This result validates further our complexity analysis and highlights the importance of the ratio of delayed events to overlapping events.

5.3 Fleet Management

The European economy relies to a great extent on commercial vehicle ﬂeets. According to the European Automobile Manufacturers Association9, there were over 54 Million commercial vehicles in use in Europe in 2015, and this number is growing every year. Commercial vehicles are equipped with devices emitting information regarding their location and operational status, such as speed and fuel level. Fleet management applications collect the information emitted from moving vehicles in order to improve the management and planning of transportation services, and enable informed decision-making. Detecting CEs from such data streams can be beneﬁcial for the drivers of commercial vehicles, since they can be informed about their performance, and even prevent dangerous situations. Additionally, the analysis of data generated by such a ﬂeet of vehicles can help the owners maximize the performance of the ﬂeet.

9. http://www.acea.be/statistics/article/vehicles-in-use-europe-2017

Tsilionis, Artikis, & Paliouras

In the context of the Track & Know10 project, we developed an online ﬂeet management system for the recognition of CEs, that improves the operating eﬃciency of a commercial ﬂeet. Our system utilises the GPS (Global Positioning System) traces of moving vehicles along with information emitted by installed accelerometer devices, such as abrupt acceleration, and information concerning the level of fuel in a vehicle s tank provided by a fuel sensor. These traces are enriched with weather and Point-of-Interest (POI) information by a dedicated component for data enrichment (Tsilionis et al., 2019b). The enriched data stream is transmitted to the CER module, in order to recognize various types of vehicle activity. All such activities have been formalized in collaboration with the domain experts of the Track & Know project. The event description includes four simple ﬂuents. The rule-set below presents a simple ﬂuent formalisation that detects opportunities for refueling:

initiated At(re Fuel Opportunity(V )=true, T)

happens At(close To Gas(V ), T),

holds At(high Speed(V )=true, T),

happens At(fuel Level(V, L), T),

threshold(V, fuel, Vtank), L < Vtank/2.

terminated At(re Fuel Opportunity(V )=true, T)

happens At(fuel Level(V, L), T),

threshold(V, fuel, Vtank), L Vtank/2.

Refueling opportunities can be helpful for companies owning commercial ﬂeets, since they place emphasis on fuel consumption. close To Gas(V ) is a spatial relation computed by the data enrichment module, indicating that a vehicle V is close to a gas station. high Speed(V ) denotes that a vehicle V is moving with a speed greater than a user-speciﬁed threshold. fuel Level(V, L) is an SDE emitted by the fuel sensor of each vehicle. threshold(V, fuel, Vtank) records the tank size of vehicles. According to rule-set (27), our system starts ﬂagging that vehicle V should refuel when it is close to a gas station, its speed is above the user-speciﬁed threshold (implying uneconomic driving), and the fuel level is lower than half of the tank size. Moreover, we stop ﬂagging the need to refuel when the fuel is more than half of the tank size. We evaluated RTECinc on real-world positional data of vehicles, provided by Vodafone Innovus11, our partner in the Track & Know project, which oﬀers ﬂeet management services. The dataset concerns approx. 6K vehicles traveling in the European road network and spans from 30 June 2018 to 8 August 2018. Each record of the data is enriched with weather data and proximity to POIs, leading to approx. 70M SDEs. This large number of SDEs allows us to test further the scalability of the CER engines. Since the original dataset is temporally sorted, a strategy of artiﬁcially injecting delays was followed as that described in Section 5.2.1. We performed ﬁve experiments, each time varying the amount of SDEs being delayed (5%, 10%, 20%, 40% and 80%). The average delay time, in all settings, is approximately 8 hours.

10. https://trackandknowproject.eu/ 11. https://www.vodafoneinnovus.com/

Incremental Event Calculus for Run-Time Reasoning

1h/88K 2h/176K 4h/356K 8h/724K 16h/1470K 0

85% 88% 81%

Window (Hours/SDEs)

Avg Time (in minutes)

RTEC inc RTEC

(a) Total recognition time (5%)

1h/83K 2h/168K 4h/342K 8h/706K 16h/1451K 0

85% 88% 76%

Window (Hours/SDEs)

Avg Time (in minutes)

RTEC inc RTEC

(b) Total recognition time (10%)

1h/74K 2h/151K 4h/315K 8h/671K 16h/1414K 0

Window (Hours/SDEs)

Avg Time (in minutes)

RTEC inc RTEC

(c) Total recognition time (20%)

1h/56K 2h/118K 4h/262K 8h/601K 16h/1340K 0

Window (Hours/SDEs)

Avg Time (in minutes)

RTEC inc RTEC

(d) Total recognition time (40%)

1h/21K 2h/52K 4h/154K 8h/461K 16h/1191K 0

Window (Hours/SDEs)

Avg Time (in minutes)

RTEC inc RTEC

(e) Total recognition time (80%)

Figure 11: Results on the ﬂeet management dataset (slide step of 1 hour). Each diagram corresponds to a diﬀerent amount of delayed SDEs (5%, 10%, 20%, 40%, 80%).

Figure 11 presents the results regarding recognition time, where the sliding window varies from 1 to 16 hours and the slide step is constant and equal to 1 hour. Each diagram corresponds to a diﬀerent amount of delayed events. RTECinc outperforms RTEC in all experiments. The performance improvement becomes more profound as the overlap between consecutive windows increases. Both RTEC and RTECinc exhibit variations in their performance; as the number of delayed events increases the intervals of a ﬂuent are more susceptible to changes. Moreover, as the percentage of delayed events decreases it is more probable for RTECinc to achieve performance gain.

6. Related Work

The goal of CER systems is to identify CEs of interest given as input a stream of timestamped SDEs, such as events coming from sensors of vessels or vehicles. In the literature, numerous CER systems and languages have been proposed. See (Cugola & Margara, 2012; Alevizos et al., 2017; Giatrakos et al., 2020) for three surveys. These systems diﬀer in their architectures, data models, pattern languages and processing mechanisms (Grez et al., 2019, 2020). For example, many CER methods are based on automata (Demers et al., 2007; Zhang et al., 2014; Schultz-Møller et al., 2009; Apache Flink CEP12). The automaton model is used

12. https://ci.apache.org/projects/flink/flink-docs-stable/dev/libs/cep.html

Tsilionis, Artikis, & Paliouras

to describe the semantics of the language as well as the recognition algorithm. Apart from automata, some CER systems employ tree-based models (Liu et al., 2011; Mei & Madden, 2009). Again, tree-based formalisms are used for both modeling and pattern matching, i.e. they may describe the event patterns and the applied recognition algorithm. Logic-based approaches to CER have also been attracting considerable attention, since they exhibit a formal, declarative semantics, and at the same time support eﬃcient reasoning (Fern et al., 2002; Dousson & Maigat, 2007; Cugola & Margara, 2010; Paschke & Bichler, 2008; Brandt et al., 2018). RTEC is a logic programming implementation of the Event Calculus (Kowalski & Sergot, 1986), that has been used for CER in various application domains, such as maritime situational awareness (Patroumpas et al., 2017; Pitsikalis et al., 2019) and ﬂeet management (Tsilionis et al., 2019b). The use of the Event Calculus facilitates considerably the development of succinct CE deﬁnitions by taking advantage of the built-in representation of inertia, favors the interaction between CE developer and domain expert and supports code maintenance. RTEC explicitly represents CE intervals (unlike e.g. Dousson & Maigat, 2007; Cugola & Margara, 2010; Beck et al., 2018) and thus avoids the related logical problems (Paschke, 2006). Furthermore, RTEC supports not only complex temporal constraints, but also complex atemporal constraints, as well as reasoning over background knowledge. This is in contrast to other CER systems proposed in the literature (Arasu et al., 2006; Dousson & Maigat, 2007; Cugola & Margara, 2010; Anicic et al., 2012). Moreover, and in contrast to state-of-the-art recognition systems, such as the Esper13 engine and SASE14 (Zhang et al., 2014), RTEC can naturally express hierarchical knowledge by means of well-structured speciﬁcations, and consequently employ caching techniques to avoid unnecessary re-computations. Concerning the Event Calculus literature, a key feature of RTEC is that it includes a windowing technique. No other Event Calculus system (including Chittaro & Montanari, 1996; Cervesato & Montanari, 2000; Miller & Shanahan, 2002; Paschke & Bichler, 2008; Lee & Palla, 2012; Montali et al., 2013) forgets or represents concisely the data stream history. Another feature of RTEC is the ability to deal with out-of-order and/or retracted SDE streams through its windowing mechanism. The processing of events under the assumption that the stream is temporally sorted is a serious limitation of several other CER systems (Gyllstrom et al., 2006; Cugola & Margara, 2010, 2012; Dindar et al., 2011; Li et al., 2011; Beck et al., 2018), since they cannot update or recognize new CEs as a result of delayed arrival or retraction of SDEs. One limitation of RTEC is the computation from scratch strategy, which results to ineﬃciencies. In this work, we presented an incremental version of RTEC, i.e. RTECinc, that overcomes this issue. The techniques presented for incremental reasoning are based on the incremental maintenance of materialised views in deductive databases. A materialised view in a deductive database is the process of computing the eﬀects of explicit facts on rules of a program and storing the result (Gupta et al., 1993; Motik et al., 2019). Explicit facts along with the implicit facts derived from the rules form the knowledge base of the database. Explicit facts can be seen as the SDEs that are the input in a CER system. The explicit facts can be updated through additions or deletions and thus, a new materialisation has to be produced. However, the update of explicit facts may or may not aﬀect the inferences produced after

13. http://www.espertech.com/esper/ 14. http://sase.cs.umass.edu//

Incremental Event Calculus for Run-Time Reasoning

rule evaluation. The goal of incremental maintenance is to compute the new materialisation, using the updates and the state of the database before the updates. The result is the minimization of the overall cost. In order for this operation to be eﬃcient, incremental maintenance has to detect and deal only with those rule instances that require update. Notice that incremental materialisation is not always preferable over materialisation from scratch. Consider, for example, the case in which large amounts of explicit facts are inserted and deleted from the database (Motik et al., 2019; Gupta et al., 1993).

An update may cause a rule that ﬁred before the update to stop ﬁring or vice versa. The changes in the inferences induced by the updates have to be applied to the old materialization, in order to produce the new one. Moreover, the changes introduced by the updates are expressed as rules, i.e. rewritings of the original rules of the program capturing the diﬀerence between consecutive materialisations. In the literature, they are called delta rules and in Section 3, based on an initiated At rule, we provide the delta rules that compute the changes that have to be made in the initiation points of a ﬂuent. Diﬀerent incremental maintenance techniques have been proposed for the evaluation of the delta rules.

Ceri and Widom (1991) have presented a technique for maintaining views in relational databases with the use of production rules (similar to the delta rules). The rewriting technique has also been used to show that a recursive relation can be expressed as a nonrecursive ﬁrst-order query with respect to insertions (Dong & Topor, 1992; Dong & Su, 1994; Dong et al., 1995). Dong and Su (1995) discuss non-recursive methods to update the transitive closure of various graphs with respect to deletions and insertions. However, none of these approaches handles negation.

Another approach towards incremental maintenance is the counting algorithm (Gupta et al., 1993; Motik et al., 2019). In counting, every fact (explicit or implicit) is associated with a counter that denotes the number of times it has been derived. The counter is stored in the materialised view. When an update takes place, the counting algorithm uses the changes made to the explicit facts, as well as the previous state of the materialisation, in order to compute the changes that have to be made in the inferences. These changes are reﬂected in the counter of each fact. When a counter becomes zero the fact must be deleted from the database. On the other hand, when the counter is positive, the fact must remain or be inserted in the view. The limitations of the counting algorithm are the inability to handle recursive rules and the potential memory overhead caused by storing the counters (Hu et al., 2018; Motik et al., 2019).

The Delete/Rederive (DRed) (Gupta et al., 1993; Motik et al., 2019) algorithm is a diﬀerent approach that deals with recursion and does not depend on additional information, in order to decide if a fact has to be removed from or be inserted in the view. DRed ﬁrst over-deletes all implicit facts that depend on updated facts, that is, it evaluates all rules whose body contains an updated fact. This is called the over-deletion step. Then, it adopts a re-derivation step, during which the algorithm tries to ﬁnd alternative derivations for each deleted fact. The body of each rule, whose head can be matched to the deleted fact, is evaluated over the new materialisation and if the evaluation succeeds the fact is inserted again to the view. The re-derivation step thus introduces a form of redundancy. The counting algorithm avoids this ineﬃciency since the counter s value determines the presence of a fact in the view (Hu et al., 2018; Motik et al., 2019).

Tsilionis, Artikis, & Paliouras

In approaches where duplicate semantics (i.e. storing a derived fact as many times as it is inferred) is not desired, as in the case of RTEC, the counting algorithm, in combination with stratum by stratum evaluation, is preferred over DRed (Motik et al., 2019). The counter of a predicate denotes how many times the fact is derived within a stratum. Predicates of higher strata that depend on it, do not need to be evaluated as many times as the value of the counter, but only if the counter has value 1 or 0, that is, if it exists or not in the view. In this manner duplicate elimination, which is an expensive operation, can be achieved at little or no extra cost (Gupta et al., 1993). Finally, counting and DRed can handle negation in the body literals of a rule (Motik et al., 2019). Negation in rule bodies is very important in CER since the absence of an event may lead to the ﬁring of the rule. Additionally, during updates the delayed arrival of an event may lead to the deletion of a rule derivation. Motik et al. (2015) presented the Backward/Forward (BF) algorithm, which in certain cases overcomes the redundancy introduced during the re-derivation step of DRed. If a fact is considered for deletion, BF searches for alternative derivations and only if one cannot be found it proceeds with the deletion. Hu et al. (2018) proposed two hybrid approaches, combining DRed and BF with counting. DRed with counting associates with each fact two counters, one for the non-recursive and one for the recursive rules. A fact is never over-deleted if its non-recursive counter is positive, while if this counter equals to zero the recursive counter must also equals to zero in order for the fact to be removed from the view. If the recursive counter diﬀers from zero the fact remains in the view. Thus, the rederivation step is replaced with a simple check at the recursive counter. In order to reduce the amount of time spent in discovering alternative derivations of a fact, BF with counting associates with each fact a non-recursive counter. Only if the non-recursive counter equals to zero it evaluates recursive rules to examine if the fact still holds. In the presence of a lot of recursive rules BF with counting is the preferred choice since it ﬁnds alternative derivations and the cost of over-deletion is reduced. However, when the number of recursive rules is small, over-deletion favors computation and thus, DRed with counting is the best choice. Notice that in the absence of recursive rules DRed with counting is identical to the original counting algorithm. An improvement of DRed with counting, regarding recursive rules, can be found in Hu et al. (2019). All the approaches presented above refer to databases which are usually static and do not undergo great changes. Our work transfers these ideas to a streaming environment, which is dynamic in nature and the underlying facts, i.e. SDEs, are constantly changing. Query answering in streaming environments needs to be fast and make use of the available limited resources. Stream management systems achieve this by employing window mechanisms that restrict the analysis in a ﬁnite part of the stream. Our method, RTECinc, takes advantage of the window mechanism of RTEC, as well as its ability to handle temporal reasoning, and applies incremental maintenance techniques in order to deal with out-of-order events. Additionally, the logic programs used in our experimental evaluation contain many rules and thus RTECinc supports multi-query answering. The ideas of DRed and rewriting of rules have been implemented in a stream reasoning framework by Barbieri et al. (2010). These authors treat insertions with the use of incremental maintenance rules. However, they do not support negation in their rules, they consider the stream temporally sorted and their logic program consists only of one rule. Another interesting stream reasoning approach that is based on Temporal Datalog is proposed by Ronca et al. (2018). In this

Incremental Event Calculus for Run-Time Reasoning

case, a window construct is presented, that forgets old and accepts new input only if the stream is temporally sorted. Furthermore, negation is not supported. DBToaster is a method that performs Higher-Order incremental view maintenance, meaning that it does not store only the result of the query but also stores the result of subqueries (Koch et al., 2014). This way it achieves considerable speed-ups in updating a materialisation compared to the methods outlined above but in the cost of increased memory consumption. In streaming environments where there are limited resources and near real-time analysis is required, this is prohibitive. Idris et al. (2019) developed a method that maintains an indexing structure that can produce through constant delay enumeration the database materialisation without the need to store the answer of the query and the results of its subqueries. Furthermore, depending on the type of the query it achieves fast updating times of the materialised view. This approach is very eﬃcient but the experimental evaluation always concerns the processing of a single query. Therefore, it is not clear how it can address large (hierarchical) knowledge bases. A work that is closely related to ours, in the sense that it concerns the Event Calculus and the minimization of redundant interval computations, is the Cached Event Calculus (CEC) (Chittaro & Montanari, 1996). A key diﬀerence between our approach and CEC is that we propagate changes stratum by stratum, as opposed to sequentially propagating the eﬀects of individual facts. Therefore, RTECinc could scale to the datasets presented in the previous section, while CEC could not. The abundance of sensors that transmit data in high rates in the recent years has also led to the development of distributed event-based systems. Mutschler and Philippsen (2014) propose a hybrid method that combines buﬀering and speculative techniques to deal with out-of-order event arrival in distributed environments. This method concerns a middleware that is responsible for sending the input to the event detectors and is blind to the recognition algorithm used by event detectors. Ajileye et al. (2019) presented a distributed materialisation of Datalog rules, where dynamic data exchange reduces the network communication. This is an interesting approach that does not consider, however, negation and incremental materialisation.

7. Summary and Future Work

We presented RTECinc, an incremental version of the Event Calculus for Run-Time reasoning. RTECinc is a formal computational framework for incremental event recognition which deals eﬃciently with the delayed arrival and retraction of SDEs. Using techniques reminiscent of the incremental maintenance of deductive databases, RTECinc avoids unnecessary calculations and improves performance. Our empirical evaluation on big synthetic and real datasets from the ﬁelds of maritime situational awareness and ﬂeet management demonstrated the eﬀectiveness of the incremental procedure. Our intention is to compare experimentally the two methods in other application domains where the granularity of time is completely diﬀerent. Furthermore, we intend to compare our incremental procedure with other state-of-the-art frameworks. Finally, besides devising optimisations that can boost performance, such as distributed incremental reasoning, we would like to extend our method in order to handle recursive deﬁnitions and events with delayed eﬀects, which are necessary in various applications (Pitsikalis et al., 2019).

Tsilionis, Artikis, & Paliouras

Acknowledgments

This work was funded partly by the EU H2020 Track & Know Big Data for Mobility Tracking Knowledge Extraction in Urban Areas and partly by the EU H2020 INFORE Interactive Extreme-Scale Analytics and Forecasting projects.

We present the built-in axioms of RTEC ensuring that a ﬂuent F may have at most one value at each time-point, the derivations of the incremental evaluation formulas concerning statically determined ﬂuents, as well the complete tables expressing the comparison of RTEC and RTECinc regarding the three interval manipulation constructs.

Built-in Axioms Concerning Fluent-Value Pairs

Consider the axioms below:

broken(F=V, Ts, T)

terminated At(F=V, Tf), Ts < Tf T. (a)

broken(F=V1, Ts, T)

initiated At(F=V2, Tf), Ts < Tf T, V1 = V2. (b)

broken(F=V, Ts, T) represents that a maximal interval starting at Ts for which F=V holds continuously is terminated at some time Tf such that Ts < Tf T. According to rule (a), F=V is broken if it is terminated. According to rule (b), if F=V2 is initiated at Tf then eﬀectively F=V1 is terminated at time Tf, for all other possible values V1 of F. Rule (b) ensures therefore that a ﬂuent cannot have more than one value at any time. We do not insist that a ﬂuent must have a value at every time-point. There is a diﬀerence between initiating a Boolean ﬂuent F=false and terminating F=true: the former implies, but is not implied by the latter.

Proofs of the Incremental Evaluation of Interval Manipulation Constructs

We present the derivations of the incremental evaluation formulas of RTECinc concerning the interval manipulation constructs. Table 16 displays a list of properties (Bassiouni et al., 1993) concerning sets of intervals, that are used in the derivations (property numbers are displayed below the horizontal curly brackets in the derivations).

Table 16: Properties of temporal sets. Capital letters represent sets of time intervals.

1 (a) A T B = B T A

(b) A T B = B T A

Incremental Event Calculus for Run-Time Reasoning

2 (a) A T (B T C) = (A T B) T (A T C)

(b) A T (B T C) = (A T B) T (A T C)

3 (a) A T (A T B) = A

(b) A T (A T B) = A

4 A T B = A \T (A \T B)

5 A \T (B T C) = (A T C) \T (B T C) =

(A \T B) \T C = (A \T C) \T B

6 (a) (A T B) \T C = (A \T C) T (B \T C)

(b) (A T B) \T C = (A \T C) T (B \T C)

7 A \T (B \T C) = (A T C) T (A \T B)

8 (A \T B) T C = (A T C) \T (B \T C)

9 (A \T B) T C = (A T C) \T B = (A T C) \T (B T C)

10 (a) A \T (B T C) = (A \T B) T (A \T C)

(b) A \T (B T C) = (A \T B) T (A \T C)

11 (a) (A \T B) T A = A, (b) (A \T B) T B = A T B

(c) (A \T B) T A = A \T B, (d) (A \T B) T B =

Derivation of Eq. (15) expressing the incremental evaluation of union all

IQi A T IQi B =

(I Qi 1 A \T I A ) T I+ A

| {z } Eq. (12)

(I Qi 1 B \T I B ) T I+ B

| {z } Eq. (12)

(I Qi 1 B \T I B ) T (I Qi 1 A \T I A ) T (I+ A T I+ B) | {z } Pr.1a

(I Qi 1 B \T I B ) T

h I Qi 1 A T (I Qi 1 A T I Qi 1 B ) i

| {z } Pr.3b

T (I+ A T I+ B) =

(I Qi 1 B \T I B ) T

I Qi 1 A T h (I Qi 1 A T I Qi 1 B ) \T I A i

| {z } Pr.9

T (I+ A T I+ B) =

Tsilionis, Artikis, & Paliouras

" I Qi 1 A T (I Qi 1 B \T I B ) T

(I Qi 1 B \T I B ) T h (I Qi 1 A T I Qi 1 B ) \T I A i #

| {z } Pr.2a

T (I+ A T I+ B) =

" I Qi 1 A T (I Qi 1 A \T I B ) | {z } Pr.11a

T (I Qi 1 B \T I B ) T

h (I Qi 1 A T I Qi 1 B ) T I Qi 1 B

| {z } Pr.3b

\T I B i T h (I Qi 1 A T I Qi 1 B ) \T I A i #

T (I+ A T I+ B) =

" I Qi 1 A T h (I Qi 1 A T I Qi 1 B ) \T I B i

| {z } Pr.6a

h (I Qi 1 A T I Qi 1 B ) T (I Qi 1 B \T I B ) i

| {z } Pr.9

T h (I Qi 1 A T I Qi 1 B ) \T I A i #

T (I+ A T I+ B) =

" h (I Qi 1 A T I Qi 1 B ) T I Qi 1 A i

| {z } Pr.3b

T h (I Qi 1 A T I Qi 1 B ) \T I B i T

(I Qi 1 A T I Qi 1 B ) \T h I A \T (I Qi 1 B \T I B ) i

| {z } Pr.7

T (I+ A T I+ B) =

" (I Qi 1 A T I Qi 1 B ) \T (I B \T I Qi 1 A )

| {z } Pr.7

(I Qi 1 A T I Qi 1 B ) \T h I A \T (I Qi 1 B \T I B ) i #

T (I+ A T I+ B) =

(I Qi 1 A T I Qi 1 B ) \T

(I B \T I Qi 1 A ) T h I A \T (I Qi 1 B \T I B ) i #

| {z } Pr.10b

T (I+ A T I+ B) =

(I Qi 1 A T I Qi 1 B ) \T

(I B \T I Qi 1 A ) T h I A T (I A T I B )

| {z } Pr.3b

\T (I Qi 1 B \T I B ) i #

T (I+ A T I+ B) =

(I Qi 1 A T I Qi 1 B ) \T

(I B \T I Qi 1 A ) T h I A T (I A T I B ) \T (I Qi 1 B \T I B ) i

| {z } Pr.9

T (I+ A T I+ B) =

(I Qi 1 A T I Qi 1 B ) \T

(I B \T I Qi 1 A ) T h (I A \T I Qi 1 B ) T I A i

| {z } Pr.11a

T h (I A \T I Qi 1 B ) T I B i

| {z } Pr.8

T (I+ A T I+ B) =

(I Qi 1 A T I Qi 1 B ) \T

(I B \T I Qi 1 A ) T (I A \T I Qi 1 B ) T (I A T I B ) | {z } Pr.2a

T (I+ A T I+ B) =

(I Qi 1 A T I Qi 1 B ) \T

h (I A \T I Qi 1 A ) T (I B \T I Qi 1 A ) T (I A \T I Qi 1 B ) T (I B \T I Qi 1 B ) i

I Qi 1 F T I F = I F I F \T I Qi 1 F =

T (I A T I B ) #

T (I+ A T I+ B) =

(I Qi 1 A T I Qi 1 B ) \T

h [(I A T I B ) \T I Qi 1 A ] T [(I A T I B ) \T I Qi 1 B ] i

| {z } Pr.6a

T (I A T I B ) #

T (I+ A T I+ B) =

(I Qi 1 A T I Qi 1 B ) \T

h (I A T I B ) \T (I Qi 1 A T I Qi 1 B ) i

| {z } Pr.10a

T (I A T I B ) #

T (I+ A T I+ B)

Incremental Event Calculus for Run-Time Reasoning

Derivation of Eq. (19) expressing the incremental evaluation of intersect all

IQi A T IQi B =

(I Qi 1 A \T I A ) T I+ A

| {z } Eq. (12)

(I Qi 1 B \T I B ) T I+ B

| {z } Eq. (12)

h (I Qi 1 A \T I A ) T I+ A i T h (I Qi 1 B \T I B ) T (I Qi 1 A \T I A ) T I+ A i

| {z } Pr.3b

T h (I Qi 1 B \T I B ) T I+ B) i T h (I Qi 1 B \T I B ) T (I Qi 1 A \T I A ) T I+ B) i

| {z } Pr.3b

h (I Qi 1 A \T I A ) T I+ A i T h (I Qi 1 B \T I B ) T I+ B) i T h (I Qi 1 B \T I B ) T (I Qi 1 A \T I A ) T I+ A i T h (I Qi 1 B \T I B ) T (I Qi 1 A \T I A ) T I+ B) i

| {z } Pr.1b

h (I Qi 1 A \T I A ) T I+ A T (I+ A T I+ B) | {z } Pr.3a

i T h (I Qi 1 B \T I B ) T I+ B T (I+ A T I+ B) | {z } Pr.3a

i T h (I Qi 1 B \T I B ) T (I Qi 1 A \T I A ) T (I+ A T I+ B) i

| {z } Pr.2a

h (I Qi 1 A \T I A ) T I+ A i T h (I Qi 1 B \T I B ) T I+ B i T h (I Qi 1 B \T I B ) T (I Qi 1 A \T I A ) i T (I+ A T I+ B) | {z } Pr.2a

" (I Qi 1 A \T I A ) T h I+ B T (I Qi 1 A \T I A ) i

| {z } Pr.3a

T h (I Qi 1 B \T I B ) T I+ B i T h (I Qi 1 B \T I B ) T (I Qi 1 A \T I A ) i#

T (I+ A T I+ B) =

" (I Qi 1 A \T I A ) T I+ A T h I+ B T (I Qi 1 A \T I A ) i

| {z } Pr.1a

(I Qi 1 B \T I B ) T h I+ B T (I Qi 1 A \T I A ) i

| {z } Pr.2a

T (I+ A T I+ B) =

h (I Qi 1 A \T I A ) T I+ A i T (I Qi 1 B \T I B ) T h I+ B T (I Qi 1 A \T I A ) i

| {z } Pr.2a

T (I+ A T I+ B) =

(I Qi 1 A \T I A ) T (I Qi 1 B \T I B ) T h I+ A T (I Qi 1 B \T I B ) i

| {z } Pr.2b

T h I+ B T (I Qi 1 A \T I A ) i T (I+ A T I+ B) =

(I Qi 1 A \T I A ) T (I Qi 1 B \T I B ) T h I+ A T (I Qi 1 B \T I B ) i T h I+ B T (I Qi 1 A \T I A ) i T (I+ A T I+ B) T (I+ A T I+ B) | {z } IF T IF = IF

(I Qi 1 A \T I A ) T (I Qi 1 B \T I B ) T

h I+ B T (I Qi 1 A \T I A ) i T (I+ A T I+ B) T

h I+ A T (I Qi 1 B \T I B ) i T (I+ A T I+ B)

| {z } Pr.1a

(I Qi 1 A \T I A ) T (I Qi 1 B \T I B ) T

h I Qi 1 A T (I+ B \T I A ) | {z } Pr.9

i T (I+ A T I+ B) T

h I Qi 1 B T (I+ A \T I B ) | {z } Pr.9

i T (I+ A T I+ B) =

(I Qi 1 A \T I A ) T (I Qi 1 B \T I B ) T

h I Qi 1 A T I+ B T (I+ B \T I A ) | {z } Pr.11c

i T (I+ A T I+ B) T

h I Qi 1 B T I+ A T (I+ A \T I B ) | {z } Pr.11c

i T (I+ A T I+ B) =

(I Qi 1 A \T I A ) T (I Qi 1 B \T I B ) T

h I Qi 1 A T (I+ B \T I Qi 1 B ) | {z }

I Qi 1 F T I+ F =

T (I+ B \T I A ) i T (I+ A T I+ B) T

h I Qi 1 B T (I+ A \T I Qi 1 A ) | {z }

I Qi 1 F T I+ F =

T (I+ A \T I B ) i T (I+ A T I+ B) =

Tsilionis, Artikis, & Paliouras

(I Qi 1 A \T I A ) T (I Qi 1 B \T I B ) T

h I Qi 1 A T [I+ B \T (I Qi 1 B T I A )] | {z } Pr.10b

i T (I+ A T I+ B) T

h I Qi 1 B T [I+ A \T (I Qi 1 A T I B )] | {z } Pr.10b

i T (I+ A T I+ B) =

(I Qi 1 A \T I A ) T I Qi 1 B T (I Qi 1 B \T I B ) T I Qi 1 A i

| {z } Pr.11c

h I+ B T [I Qi 1 A \T (I Qi 1 B T I A )] | {z } Pr.9

i T (I+ A T I+ B) T

h I+ A T [I Qi 1 B \T (I Qi 1 A T I B )] i

| {z } Pr.9

T (I+ A T I+ B) =

h (I Qi 1 A T I Qi 1 B ) \T I A i

| {z } Pr.9

T h (I Qi 1 A T I Qi 1 B ) \T I B i

| {z } Pr.9

I+ B T h I Qi 1 A \T (I Qi 1 B T I A ) T I+ A i

| {z } Pr.2b

I+ A T h I Qi 1 B \T (I Qi 1 A T I B ) T I+ B i

| {z } Pr.2b

(I Qi 1 A T I Qi 1 B ) \T (I A T I B ) | {z } Pr.10b

I+ B T h (I Qi 1 A \T I Qi 1 B ) \T I A

| {z } Pr.5

I+ A T h (I Qi 1 B \T I Qi 1 A ) \T I B

| {z } Pr.5

(I Qi 1 A T I Qi 1 B ) \T (I A T I B ) T

I+ B T h ( I Qi 1 A\T B | {z }

I Qi 1 A\T B =(I Qi 1 A \T I Qi 1 B )

\T I A ) T I+ A i T

I+ A T h ( I Qi 1 B\T A | {z }

I Qi 1 B\T A =(I Qi 1 B \T I Qi 1 A )

\T I B ) T I+ B i

Derivation of Eq. (23) expressing the incremental evaluation of

relative complement all

IQi A \T IQi B =

(I Qi 1 A \T I A ) T I+ A

| {z } Eq. (12)

(I Qi 1 B \T I B ) T I+ B

| {z } Eq. (12)

h (I Qi 1 A \T I A ) I+ A i T h (I Qi 1 A \T I A ) T I+ A T I B i

\T h (I Qi 1 B \T I B ) T I+ B i =

(I Qi 1 A \T I A ) I+ A

h (I Qi 1 A \T I A ) T I+ A T I B i \T h (I Qi 1 B \T I B ) T I+ B i

| {z } Pr.9

(I Qi 1 A \T I A ) I+ A T h I+ A \T (I Qi 1 B \T I B ) T I+ B i

| {z } Pr.11a

h (I Qi 1 A \T I A ) T I B \T (I Qi 1 B \T I B ) T I+ B i T h I+ A \T (I Qi 1 B \T I B ) T I+ B i

| {z } Pr.6a

h (I Qi 1 A \T I A ) I+ A i T h (I Qi 1 A \T I A ) T I B \T (I Qi 1 B \T I B ) T I+ B i T

I+ A \T (I Qi 1 B \T I B ) T I+ B

| {z } Pr.2a

h (I Qi 1 A \T I A ) I+ A i T h (I Qi 1 A \T I A ) T I B \T (I Qi 1 B \T I B ) T (I+ B \T I B ) | {z }

I F T I+ F =

I+ A \T (I Qi 1 B \T I B ) T I+ B =

h (I Qi 1 A \T I A ) I+ A i T h (I Qi 1 A \T I A ) T I B \T (I Qi 1 B T I+ B) \T I B

| {z } Pr.6a

I+ A \T (I Qi 1 B \T I B ) T I+ B =

h (I Qi 1 A \T I A ) \T (I Qi 1 B T I+ B) T (I Qi 1 A \T I A ) | {z } Pr.11a

T I+ A i T h (I Qi 1 A \T I A ) \T (I Qi 1 B T I+ B) T I B i

| {z } Pr.8

I+ A \T (I Qi 1 B \T I B ) T I+ B =

Incremental Event Calculus for Run-Time Reasoning

(I Qi 1 A \T I A ) \T (I Qi 1 B T I+ B) T

h (I Qi 1 A \T I A ) T I+ A i T I B

| {z } Pr.2a

I+ A \T h (I Qi 1 B \T I B ) T I+ B i =

(I Qi 1 A \T I A ) \T (I Qi 1 B T I+ B) T

I+ A \T h (I Qi 1 B \T I B ) T I+ B i T

h (I Qi 1 A \T I A ) T I+ A i T I B

| {z } Pr.1a

(I Qi 1 A \T I A ) \T (I Qi 1 B T I+ B) T

h (I+ A T I Qi 1 A ) | {z }

I Qi 1 F T I+ F =

\T I+ B i T h I+ A \T (I Qi 1 B \T I B ) T I+ B i T

h (I Qi 1 A \T I A ) T I+ A i T I B

(I Qi 1 A \T I A ) \T (I Qi 1 B T I+ B) T

h I+ A T (I Qi 1 A \T I+ B) i

| {z } Pr.9

T h I+ A \T (I Qi 1 B \T I B ) T I+ B i T

h (I Qi 1 A \T I A ) T I B i T h I+ A T I B i

| {z } Pr.2b

(I Qi 1 A \T I A ) \T (I Qi 1 B T I+ B) T

I+ A \T h (I Qi 1 B \T I B ) T I+ B \T I Qi 1 A \T I+ B i

| {z } Pr.7

h (I Qi 1 A \T I A ) T I Qi 1 B T I B | {z }

I Qi 1 F T I F = I F

i T h I+ A T I B i =

(I Qi 1 A \T I A ) \T (I Qi 1 B T I+ B) T

I+ A \T h (I Qi 1 B \T I B ) \T I Qi 1 A T I+ B i

| {z } Pr.8

I B T h (I Qi 1 A \T I A ) T I Qi 1 B T I+ A i

| {z } Pr.2b

(I Qi 1 A \T I Qi 1 B ) \T (I A T I+ B)

| {z } Pr.5

I+ A \T h (I Qi 1 B \T I Qi 1 A ) \T I B

| {z } Pr.5

I B T h (I Qi 1 A T I Qi 1 B ) \T I A

| {z } Pr.9

(I Qi 1 A \T I Qi 1 B ) \T (I A T I+ B) T

I+ A \T h ( I Qi 1 B\T A | {z }

I Qi 1 B\T A = I Qi 1 B \T I Qi 1 A

\T I B ) T I+ B i T

I B T h ( I Qi 1 A T B | {z }

I Qi 1 A T B = I Qi 1 A T I Qi 1 B

\T I A ) T I+ A i

Tsilionis, Artikis, & Paliouras

Complete Comparison for Statically Determined Fluents.

Table 17: The 16 diﬀerent cases of union all according to the deletion and insertion sets of the two body ﬂuents. Bold lines highlight the cases in which RTECinc outperforms RTEC.

|I A| |I+ A| |I B| |I+ B| RTEC vs RTECinc

1 = 0 = 0 = 0 = 0 |IQi 1 A T B| |IQi 1 AB | 3

2|IQi 1 A T B| 5|I + AB |

2 0 = 0 = 0 = 0 |IQi 1 A T B| |IQi 1 AB | 3

2|IQi 1 A T B| 2|I + AB |

3 = 0 0 = 0 = 0 |IQi 1 A T B| |IQi 1 AB | 3

2|IQi 1 A T B| 4|I + AB |

4 0 0 = 0 = 0 |IQi 1 A T B| |IQi 1 AB | 3

2|IQi 1 A T B| |I + AB |

5 = 0 = 0 0 = 0 |IQi 1 A T B| |IQi 1 AB | 3

2|IQi 1 A T B| 2|I + AB |

6 0 = 0 0 = 0 |I Qi 1 A T B| < 2|I Qi 1 AB | 2|I + AB |

7 = 0 0 0 = 0 |IQi 1 A T B| |IQi 1 AB | 3

2|IQi 1 A T B| |I + AB |

8 0 0 0 = 0 |I Qi 1 A T B| < 2|I Qi 1 AB |

9 = 0 = 0 = 0 0 |IQi 1 A T B| |IQi 1 AB | 3

2|IQi 1 A T B| 4|I + AB |

10 0 = 0 = 0 0 |IQi 1 A T B| |IQi 1 AB | 3

2|IQi 1 A T B| |I + AB |

11 = 0 0 = 0 0 |IQi 1 A T B| 2|IQi 1 AB | 2|IQi 1 A T B| 3|I + AB |

12 0 0 = 0 0 |IQi 1 A T B| 2|IQi 1 AB | 2|IQi 1 A T B| |I + AB |

13 = 0 = 0 0 0 |IQi 1 A T B| |IQi 1 AB | 3

2|IQi 1 A T B| |I + AB |

14 0 = 0 0 0 |I Qi 1 A T B| < 2|I Qi 1 AB |

15 = 0 0 0 0 |IQi 1 A T B| 2|IQi 1 AB | 2|IQi 1 A T B| |I + AB |

Incremental Event Calculus for Run-Time Reasoning

Table 18: The 16 diﬀerent cases of intersect all according to the deletion and insertion sets of the two body ﬂuents. Bold lines highlight the cases in which RTECinc outperforms RTEC.

|I A| |I+ A| |I B| |I+ B| RTEC vs RTECinc

1 = 0 = 0 = 0 = 0 |IQi 1 A T B| |IQi 1 AB | 5|IQi 1 A\T B| 19

2 |I + AB |

2 0 = 0 = 0 = 0 |IQi 1 A T B| |IQi 1 AB | 9

2|IQi 1 A\T B| 6|I + AB |

3 = 0 0 = 0 = 0 |I Qi 1 A T B| < |I Qi 1 AB | 3

2|I Qi 1 A\T B| 2|I + AB |

4 0 0 = 0 = 0 |I Qi 1 A T B| < |I Qi 1 AB | |I Qi 1 A\T B|

5 = 0 = 0 0 = 0 |IQi 1 A T B| |IQi 1 AB | 5

2|IQi 1 A\T B| 6|I + AB |

6 0 = 0 0 = 0 |I Qi 1 A T B| < 2|I Qi 1 AB | 8|I Qi 1 A\T B| 7

7 = 0 0 0 = 0 |I Qi 1 A T B| < |I Qi 1 AB | 3

2|I Qi 1 A\T B| 3

8 0 0 0 = 0 |I Qi 1 A T B| < 2|I Qi 1 AB | 2|I Qi 1 A\T B|

9 = 0 = 0 = 0 0 |I Qi 1 A T B| < |I Qi 1 AB | 3

2|I Qi 1 A\T B| 2|I + AB |

10 0 = 0 = 0 0 |I Qi 1 A T B| < |I Qi 1 AB | 3

2|I Qi 1 A\T B| 3

11 = 0 0 = 0 0 |I Qi 1 A T B| < 2|I Qi 1 AB | 2|I + AB |

12 0 0 = 0 0 |I Qi 1 A T B| < 2|I Qi 1 AB |

13 = 0 = 0 0 0 |I Qi 1 A T B| < |I Qi 1 AB | |I Qi 1 A\T B|

14 0 = 0 0 0 |I Qi 1 A T B| < 2|I Qi 1 AB | 2|I Qi 1 A\T B|

15 = 0 0 0 0 |I Qi 1 A T B| < 2|I Qi 1 AB |

Tsilionis, Artikis, & Paliouras

Table 19: The 16 diﬀerent cases of relative complement all according to the deletion and insertion sets of the two body ﬂuents. Bold lines highlight the cases in which RTECinc outperforms RTEC.

|I A| |I+ A| |I B| |I+ B| RTEC vs RTECinc

1 = 0 = 0 = 0 = 0 |IQi 1 A\T B| |IQi 1 AB | 5

2(|IQi 1 B\T A| + |IQi 1 A T B|) 8|I + AB |

2 0 = 0 = 0 = 0 |IQi 1 A\T B| |IQi 1 AB | 5

2|IQi 1 B\T A| 2|IQi 1 A T B| 9

3 = 0 0 = 0 = 0 |I Qi 1 A\T B| < |I Qi 1 AB | 3

2|I Qi 1 A T B| 5

4 0 0 = 0 = 0 |I Qi 1 A\T B| < |I Qi 1 AB | |I Qi 1 A T B|

5 = 0 = 0 0 = 0 |I Qi 1 A\T B| < |I Qi 1 AB | 3

2|I Qi 1 B\T A| 3|I + AB |

6 0 = 0 0 = 0 |I Qi 1 A\T B| < |I Qi 1 AB | 3

2|I Qi 1 B\T A| 2|I + AB |

7 = 0 0 0 = 0 |I Qi 1 A\T B| < 2|I Qi 1 AB | 2|I + AB |

8 0 0 0 = 0 |I Qi 1 A\T B| < 2|I Qi 1 AB |

9 = 0 = 0 = 0 0 |IQi 1 A\T B| |IQi 1 AB | 5

2|IQi 1 A T B| 2|IQi 1 B\T A| 5|I + AB |

10 0 = 0 = 0 0 |IQi 1 A\T B| 2|IQi 1 AB | 4(|IQi 1 B\T A| + |IQi 1 A T B|) 5

11 = 0 0 = 0 0 |I Qi 1 A\T B| < |I Qi 1 AB | 3

2|I Qi 1 A T B| 3

12 0 0 = 0 0 |I Qi 1 A\T B| < 2|I Qi 1 AB | 2|I Qi 1 A T B|

13 = 0 = 0 0 0 |I Qi 1 A\T B| < |I Qi 1 AB | |I Qi 1 B\T A| |I + AB |

14 0 = 0 0 0 |I Qi 1 A\T B| < 2|I Qi 1 AB | 2|I Qi 1 B\T B| |I + AB |

15 = 0 0 0 0 |I Qi 1 A\T B| < 2|I Qi 1 AB |

Incremental Event Calculus for Run-Time Reasoning

Ajileye, T., Motik, B., & Horrocks, I. (2019). Datalog materialisation in distributed RDF stores with dynamic data exchange. In The Semantic Web - ISWC 2019 - 18th International Semantic Web Conference, Auckland, New Zealand, October 26-30, 2019, Proceedings, Part I, pp. 21 37.

Alevizos, E., Skarlatidis, A., Artikis, A., & Paliouras, G. (2017). Probabilistic complex event recognition: A survey. ACM Comput. Surv., 50(5), 71:1 71:31.

Anicic, D., Rudolph, S., Fodor, P., & Stojanovic, N. (2012). Real-time complex event recognition and reasoning - a logic programming approach. Applied Artiﬁcial Intelligence, 26(1-2), 6 57.

Arasu, A., Babu, S., & Widom, J. (2006). The cql continuous query language: Semantic foundations and query execution. The VLDB Journal, 15(2), 121 142.

Artikis, A., Sergot, M. J., & Paliouras, G. (2015). An event calculus for event recognition. IEEE Trans. Knowl. Data Eng., 27(4), 895 908.

Barbieri, D. F., Braga, D., Ceri, S., Della Valle, E., & Grossniklaus, M. (2010). Incremental reasoning on streams and rich background knowledge. In Aroyo, L., Antoniou, G., Hyv onen, E., ten Teije, A., Stuckenschmidt, H., Cabral, L., & Tudorache, T. (Eds.), The Semantic Web: Research and Applications, pp. 1 15, Berlin, Heidelberg. Springer Berlin Heidelberg.

Bassiouni, M., Llewellyn, M., & Mukherjee, A. (1993). Time-based operators for relational algebra query languages. Computer Languages, 19(4), 261 276.

Beck, H., Dao-Tran, M., & Eiter, T. (2018). LARS: A logic-based framework for analytic reasoning over streams. Artif. Intell., 261, 16 70.

Brandt, S., Kalayci, E. G., Ryzhikov, V., Xiao, G., & Zakharyaschev, M. (2018). Querying log data with metric temporal logic. Journal of Artiﬁcial Intelligence Research, 62(1), 829 877.

Camossi, E., Jousselme, A.-L., Ray, C., Hadzagic, M., Dreo, R., & Claramunt, C. (2017). Maritime experiments speciﬁcation, h2020 datacron project deliverable d5.3.. http://datacron-project.eu/.

Ceri, S., & Widom, J. (1991). Deriving production rules for incremental view maintenance. In Proceedings of the 17th International Conference on Very Large Data Bases, VLDB 91, pp. 577 589, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

Cervesato, I., & Montanari, A. (2000). A calculus of macro-events: Progress report. In Seventh International Workshop on Temporal Representation and Reasoning, TIME 2000, Nova Scotia, Canada, July 7-9, 2000, pp. 47 58.

Chittaro, L., & Montanari, A. (1996). Eﬃcient temporal reasoning in the cached event calculus. Computational Intelligence, 12(3), 359 382.

Clark, K. L. (1977). Negation as failure. In Logic and Data Bases, Symposium on Logic and Data Bases, Centre d etudes et de recherches de Toulouse, France, 1977, pp. 293 322.

Tsilionis, Artikis, & Paliouras

Cugola, G., & Margara, A. (2010). TESLA: a formally deﬁned event speciﬁcation language. In Proceedings of the Fourth ACM International Conference on Distributed Event Based Systems, DEBS 2010, Cambridge, United Kingdom, July 12-15, 2010, pp. 50 61.

Cugola, G., & Margara, A. (2012). Complex event processing with t-rex. Journal of Systems and Software, 85(8), 1709 1728.

Demers, A. J., Gehrke, J., Panda, B., Riedewald, M., Sharma, V., & White, W. M. (2007). Cayuga: A general purpose event monitoring system. In CIDR 2007, Third Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 7-10, 2007, Online Proceedings, pp. 412 422.

Dindar, N., Fischer, P. M., Soner, M., & Tatbul, N. (2011). Eﬃciently correlating complex events over live and archived data streams. In Proceedings of the 5th ACM International Conference on Distributed Event-based System, DEBS 11, pp. 243 254, New York, NY, USA. ACM.

Dong, G., & Su, J. (1994). First-order incremental evaluation of datalog queries. In Beeri, C., Ohori, A., & Shasha, D. E. (Eds.), Database Programming Languages (DBPL-4), pp. 295 308, London. Springer London.

Dong, G., Su, J., & Topor, R. (1995). Nonrecursive incremental evaluation of datalog queries. Annals of Mathematics and Artiﬁcial Intelligence, 14(2), 187 223.

Dong, G., & Topor, R. (1992). Incremental evaluation of datalog queries. In Biskup, J., & Hull, R. (Eds.), Database Theory ICDT 92, pp. 282 296, Berlin, Heidelberg. Springer Berlin Heidelberg.

Dong, G., & Su, J. (1995). Incremental and decremental evaluation of transitive closure by ﬁrst-order queries. Information and Computation, 120(1), 101 106.

Dousson, C., & Maigat, P. L. (2007). Chronicle recognition improvement using temporal focusing and hierarchization. In IJCAI 2007, Proceedings of the 20th International Joint Conference on Artiﬁcial Intelligence, Hyderabad, India, January 6-12, 2007, pp. 324 329.

Fern, A., Givan, R., & Siskind, J. M. (2002). Speciﬁc-to-general learning for temporal events with application to learning event deﬁnitions from video. Journal of Artiﬁcial Intelligence Research, 17(1), 379 449.

Giatrakos, N., Alevizos, E., Artikis, A., Deligiannakis, A., & Garofalakis, M. N. (2020). Complex event recognition in the big data era: a survey. VLDB J., 29(1), 313 352.

Grez, A., Riveros, C., & Ugarte, M. (2019). A formal framework for complex event processing. In 22nd International Conference on Database Theory, ICDT 2019, March 26-28, 2019, Lisbon, Portugal, pp. 5:1 5:18.

Grez, A., Riveros, C., Ugarte, M., & Vansummeren, S. (2020). On the Expressiveness of Languages for Complex Event Recognition. In Lutz, C., & Jung, J. C. (Eds.), 23rd International Conference on Database Theory (ICDT 2020), Vol. 155 of Leibniz International Proceedings in Informatics (LIPIcs), pp. 15:1 15:17, Dagstuhl, Germany. Schloss Dagstuhl Leibniz-Zentrum f ur Informatik.

Incremental Event Calculus for Run-Time Reasoning

Gupta, A., Mumick, I. S., & Subrahmanian, V. S. (1993). Maintaining views incrementally. SIGMOD Rec., 22(2), 157 166.

Gyllstrom, D., Wu, E., Chae, H., Diao, Y., Stahlberg, P., & Anderson, G. (2006). SASE: complex event processing over streams. Co RR, abs/cs/0612128.

Hu, P., Motik, B., & Horrocks, I. (2018). Optimised maintenance of datalog materialisations. In Proceedings of the Thirty-Second AAAI Conference on Artiﬁcial Intelligence, (AAAI-18), the 30th innovative Applications of Artiﬁcial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artiﬁcial Intelligence (EAAI18), New Orleans, Louisiana, USA, February 2-7, 2018, pp. 1871 1879.

Hu, P., Motik, B., & Horrocks, I. (2019). Modular materialisation of datalog programs. In The Thirty-Third AAAI Conference on Artiﬁcial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artiﬁcial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artiﬁcial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019, pp. 2859 2866.

Idris, M., Ugarte, M., Vansummeren, S., Voigt, H., & Lehner, W. (2019). Eﬃcient query processing for dynamically changing datasets. SIGMOD Rec., 48(1), 33 40.

Koch, C., Ahmad, Y., Kennedy, O., Nikolic, M., N otzli, A., Lupei, D., & Shaikhha, A. (2014). Dbtoaster: higher-order delta processing for dynamic, frequently fresh views. The VLDB Journal, 23(2), 253 278.

Kowalski, R. A., & Sergot, M. J. (1986). A logic-based calculus of events. New Generation Comput., 4(1), 67 95.

Lee, J., & Palla, R. (2012). Reformulating the situation calculus and the event calculus in the general theory of stable models and in answer set programming. Journal of Artiﬁcial Intelligence Research, 43, 571 620.

Li, M., Mani, M., Rundensteiner, E. A., & Lin, T. (2011). Complex event pattern detection over streams with interval-based temporal semantics. In Proceedings of the 5th ACM International Conference on Distributed Event-based System, DEBS 11, pp. 291 302, New York, NY, USA. ACM.

Liu, M., Rundensteiner, E. A., Greenﬁeld, K., Gupta, C., Wang, S., Ari, I., & Mehta, A. (2011). E-cube: multi-dimensional event sequence analysis using hierarchical pattern query sharing. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece, June 12-16, 2011, pp. 889 900.

Mei, Y., & Madden, S. (2009). Zstream: a cost-based query processor for adaptively detecting composite events. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, Providence, Rhode Island, USA, June 29 - July 2, 2009, pp. 193 206.

Miller, R., & Shanahan, M. (2002). Some alternative formulations of the event calculus. In Computational Logic: Logic Programming and Beyond, Essays in Honour of Robert A. Kowalski, Part II, pp. 452 490.

Tsilionis, Artikis, & Paliouras

Montali, M., Maggi, F. M., Chesani, F., Mello, P., & van der Aalst, W. M. P. (2013). Monitoring business constraints with the event calculus. ACM TIST, 5(1), 17:1 17:30.

Motik, B., Nenov, Y., Piro, R., & Horrocks, I. (2015). Incremental update of datalog materialisation: The backward/forward algorithm. In Proceedings of the Twenty-Ninth AAAI Conference on Artiﬁcial Intelligence, AAAI 15, pp. 1560 1568. AAAI Press.

Motik, B., Nenov, Y., Piro, R., & Horrocks, I. (2019). Maintenance of datalog materialisations revisited. Artif. Intell., 269, 76 136.

Mutschler, C., & Philippsen, M. (2014). Adaptive speculative processing of out-of-order event streams. ACM Trans. Internet Techn., 14(1), 4:1 4:24.

Paschke, A. (2006). Eca-ruleml: An approach combining ECA rules with temporal intervalbased KR event/action logics and transactional update logics. Co RR, abs/cs/0610167.

Paschke, A., & Bichler, M. (2008). Knowledge representation concepts for automated SLA management. Decision Support Systems, 46(1), 187 205.

Patroumpas, K., Alevizos, E., Artikis, A., Vodas, M., Pelekis, N., & Theodoridis, Y. (2017). Online event recognition from moving vessel trajectories. Geo Informatica, 21(2), 389 427.

Pitsikalis, M., Artikis, A., Dreo, R., Ray, C., Camossi, E., & Jousselme, A.-L. (2019). Composite event recognition for maritime monitoring. In Proceedings of the 13th ACM International Conference on Distributed and Event-Based Systems, DEBS 19, p. 163 174, New York, NY, USA. Association for Computing Machinery.

Przymusinski, T. (1987). On the declarate semantics of stratiﬁed deductive databases and logic programs. In Foundations of Deductive Databases and Logic Programming. Morgan.

Ray, C., Dr eo, R., Camossi, E., Jousselme, A.-L., & Iphar, C. (2019). Heterogeneous integrated dataset for maritime intelligence, surveillance, and reconnaissance. https://zenodo.org/record/1167595.

Ray, C., Camossi, E., Jousselme, A.-L., Hadzagic, M., Claramunt, C., & Batty, E. (2016). Maritime data preparation and curation, h2020 datacron project deliverable d5.2.. http://datacron-project.eu/.

Ronca, A., Kaminski, M., Grau, B. C., Motik, B., & Horrocks, I. (2018). Stream reasoning in temporal datalog. In Proceedings of the Thirty-Second AAAI Conference on Artiﬁcial Intelligence, (AAAI-18), the 30th innovative Applications of Artiﬁcial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artiﬁcial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pp. 1941 1948.

Santipantakis, G. M., Vlachou, A., Doulkeridis, C., Artikis, A., Kontopoulos, I., & Vouros, G. A. (2018). A stream reasoning system for maritime monitoring. In 25th International Symposium on Temporal Representation and Reasoning, TIME 2018, Warsaw, Poland, October 15-17, 2018, pp. 20:1 20:17.

Incremental Event Calculus for Run-Time Reasoning

Schultz-Møller, N. P., Migliavacca, M., & Pietzuch, P. R. (2009). Distributed complex event processing with query rewriting. In Proceedings of the Third ACM International Conference on Distributed Event-Based Systems, DEBS 2009, Nashville, Tennessee, USA, July 6-9, 2009.

Tsilionis, E., Artikis, A., & Paliouras, G. (2019a). Incremental event calculus for run-time reasoning. In Proceedings of the 13th ACM International Conference on Distributed and Event-based Systems, DEBS 19, pp. 79 90, New York, NY, USA. ACM.

Tsilionis, E., Koutroumanis, N., Nikitopoulos, P., Doulkeridis, C., & Artikis, A. (2019b). Online event recognition from moving vehicles: Application paper. Theory Pract. Log. Program., 19(5-6), 841 856.

Zhang, H., Diao, Y., & Immerman, N. (2014). On complexity and optimization of expensive queries in complex event processing. In International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, USA, June 22-27, 2014, pp. 217 228.