# situation_calculus_semantics_for_actual_causality__bdbda7d0.pdf

Situation Calculus Semantics for Actual Causality

Vitaliy Batusov York University Toronto, Canada vbatusov@cse.yorku.ca

Mikhail Soutchanski Ryerson University Toronto, Canada http://www.scs.ryerson.ca/mes

The deﬁnitions of actual cause given by Pearl and Halpern (HP) in the framework of causal models provided vital computational insight into an old philosophical problem but by no means resolved it. One source of concern is the lack of objective criteria for selecting possible worlds to be admitted into the counterfactual analysis, epitomized by the competition between multiple proposals by HP and others. Another concern is due to the modest expressivity of propositionallevel structural equations which limits their applicability and, arguably, contributes to the the former problem. We tackle both of these issues using a novel approach. We build our deﬁnition of actual cause from ﬁrst principles in the context of atemporal situation calculus (SC) action theories with sequential actions. As a result, we can successfully identify actual causes of conditions expressed in ﬁrst-order logic. We validate the HP approach by providing a formal translation from causal models to SC and proving a relationship between our deﬁnitions of actual cause and that of HP. Using wellknown and new examples, we show that long-standing disagreements between alternative deﬁnitions of actual causality can be mitigated by faithful SC modelling of the domains.

1 Introduction Actual causality, also known as token-level causality, is concerned with ﬁnding in a given scenario a singular event that caused another event. This is in contrast to type-level causality which is concerned with universal causal mechanisms governing the world. The leading line of computational enquiry into actual causality was pioneered by (Pearl 1998; 2000) and continued by (Halpern and Pearl 2005; Halpern 2000; Eiter and Lukasiewicz 2002; Hopkins 2005; Halpern 2015; 2016) and in other publications. We call it the HP approach. It is based on the concept of structural equations (Simon 1953; 1977) and implemented in the framework of causal models. The HP approach follows the Humean counterfactual deﬁnition of causation, which posits that saying an event A caused an outcome B is the same as saying if A had not been, then B never had existed . This deﬁnition is well-known to suffer from the problem of preemption: it could be the case that in the absence of event A, B would still have occurred due to another event, which in the original scenario was preempted by A. HP address this by performing counterfactual analysis only under carefully selected contingencies which suspend some subset

Copyright c 2018, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

of the model s mechanisms. Selecting proper contingencies proved to be a challenging task (Halpern 2015). As mentioned in (Halpern 2016) on p.27, The jury is still out on what the right deﬁnition of causality is . The HP approach is prone to producing results that cannot be reconciled with intuitive understanding due to the limited expressiveness of causal models (Hopkins 2005; Hopkins and Pearl 2007). The ontological commitments of structural causal models resemble propositional logic, they have no objects, no relationships, no time, no support for quantiﬁed causal queries. Thus, causal models are too coarse to distinguish between enduring conditions and transitional events, providing only atomic propositions to model both. Moreover, causal models represent presence and absence of an event identically by assigning a value to a propositional variable. Both of these deﬁciencies stem from the lack of a mechanism for modelling change over time. Since counterfactual theories of actual causality based on structural equations share the same ailments (Menzies 2014; Glymour et al. 2010), it seems natural to explore actual causality from a different perspective. We do this in the language of the situation calculus under the classical Tarskian semantics, where the notion of a cause naturally aligns with the notion of an action, and the effect can be speciﬁed by a FOL formula with quantiﬁers over object variables. In contrast to HP whose analysis is based on observing the end results of interventions, we do so by analyzing the dynamics which lead to the end results. We start with a brief introduction to situation calculus (SC) in Section 2, and to HP s causal models in Section 3. In Section 4 we propose our new deﬁnition of an achievement cause. We investigate the formal relation between our deﬁnition and the recent HP s ( modiﬁed ) deﬁnition of an actual cause in Section 5. Finally, in Section 6 we discuss related work, and then conclude in Section 7.

2 Situation Calculus

SC is proposed in (Mc Carthy and Hayes 1969) and elaborated in (Reiter 2001). In the Reiter s SC, the constant S0 denotes the initial situation that represents an empty list of actions, while the complex situation term do([α1, ...., αn], S0) represents the situation that results from executing actions α1, ..., αn consecutively so that α1 is executed in S0, and αn is executed last. If none of the action terms αi have variables, then we call this situation term an (actual) narrative. An action term αi may occur in the narrative

The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)

more than once at different positions. The set of all situations can be visualized as a tree with a partial-order relation s1 s2 on situations s1, s2, and s1 s2 abbreviates s1 s2 s1 = s2. It is characterized by the foundational domain-independent axioms (Σ) included in a basic action theory (BAT) D that also includes axioms DS0 describing the initial situation, and action precondition axioms Dap using the predicate Poss(a, s) to say when an action a is possible in s. For each action function there is one precondition axiom Poss(A( x), s) ΠA( x, s), where all free variables are implicitly -quantiﬁed, and Π( x, s) is a formula uniform in s, meaning that it has no occurrences of Poss, , no other situation terms, no quantiﬁers over situations. For each ﬂuent F, D includes a successor state axiom (SSA)

F( x, do(a, s)) ψ+( x, a, s) F( x, s) ψ ( x, a, s)),

where the ﬂuent predicate F( x, s) represents a situationdependent relation over a tuple of objects x, uniform formulas ψ+( x, a, s) and ψ ( x, a, s) specify action terms that under certain application-dependent conditions have a positive effect (make F true), or a negative effect on ﬂuent F (make it false), respectively. The SSAs are derived under the causal completeness assumption (Reiter 1991) that all effects of actions on ﬂuents are explicitly represented. There are a number of auxiliary axioms, such as unique name axioms, that are included in D. The abbreviation executable(s) means that each action mentioned in the situation term s was possible in the situation in which it was executed:

executable(s) def= a s (do(a, s ) s Poss(a, s )).

In SC, the formulas s ψ(s), where ψ(s) is uniform in s, are called state constraints since they represent conditions true in every state (Lin and Reiter 1994). Without loss of generality, we would assume that any given BAT D entails all state constraints, i.e., they are compiled into D. The basic computational challenge, called the projection problem, is the task of establishing whether a BAT entails, for an executable ground situation term σ, that a query sentence φ(σ) holds, where φ(σ) is a formula uniform in σ. This problem can be solved using the one-step regression operator ρ. The expression ρ[ϕ(s), α] denotes the formula obtained from ϕ(s) by replacing each ﬂuent atom F( t, s) that occurs in ϕ with the right-hand side of the SSA for F, where an action variable a is replaced with the ground action term α, and then the resulting formula is simpliﬁed using unique name axioms for actions and constants. Similarly to the theorem about multistep regression R in (Reiter 1991), one can prove that, given a BAT D, a formula ϕ(s) uniform in s, and a ground action term α, we have that D |= ϕ(do(α, s)) ρ[ϕ(s), α].

3 The Halpern-Pearl Approach Halpern and Pearl (2005), following the motivation of (Lewis 1974), base their formal account of actual causality on the notion of a counterfactual a conditional statement whose premise is contrary to fact. They construct counterfactual statements in a formal language whose semantics is deﬁned relative to a causal setting (see below). A causal model M is a tuple U, V, R, F , where U and V are disjoint

sets of exogenous and endogenous variables, respectively, with each variable taking various values from an underlying domain. The function R maps every variable Z U V to a non-empty set R(Z) of possible values. F is a set of total functions {FX : Z U V\{X}R(Z) R(X) | X V} which act like structural equations; each ﬁnite tuple of values assigned to the variables (excluding X) maps to a single value of X. Intuitively, for each endogenous variable X, FX encodes the entirety of causal laws which determine X by mapping every value assignment on all variables except X to some value of X. The values of exogenous variables U are set externally; a tuple VU of values for U is called a context of M, and the pair (M, VU) constitutes a causal setting. The tuple U, V, R is called the signature of M. The set of functions F determines a partial dependency order X Y on endogenous variables X, Y . Namely, Y depends on X, X Y , if either X affects Y directly by virtue of FY , or indirectly via intermediate functions. It is subsequently assumed that a given causal model is acyclic, that is, for each context VU of M, there is a partial order on V that is anti-symmetric, reﬂexive and transitive. This assumption guarantees the existence of a unique solution to the equations F. The language of the HP approach is as follows. A primitive event is a formula X = VX where X V and VX R(X). We call a Boolean combination of primitive events a HP query. A general causal formula is one of the form [Y1 VY1, . . . , Yk VYk]φ where φ is a HP query, Yi for 1 i k are distinct variables from V, and VYi R(Yi). (We abbreviate [Y1 VY1, . . . , Yk VYk] as [ Y VY ] and call it intervention.). A primitive event X =VX is satisﬁed in a causal setting (M, VU), denoted (M, VU) |= (X =VX), if X takes on the value VX in the unique solution to the equations F once U are set to VU. HP queries are interpreted following the usual rules for Boolean connectives. Finally, (M, VU) |= [Y1 VY1, . . . , Yk VYk]φ iff (M , VU) |= φ where M is obtained from M by replacing each FYi F by the trivial function FYi : Z U V\{X}R(Z) VYi that ﬁxes Yi to a constant VYi for all the values of arguments. Since M is acyclic, M remains acyclic too. In this paper, we focus on the so-called modiﬁed HP definition, or HPm, of actual cause (Hopkins 2005; Halpern 2015; 2016) because it is the most recent, intuitively appealing, and thoroughly connected with older deﬁnitions by formal results in (Halpern 2016). According to this deﬁnition, the conjunction of primitive events X = VX (short for X1 =VX1 . . . Xk =VXk) is an actual cause in (M, VU) of a HP query φ if all following conditions hold: 1. (M, VU) |= ( X = VX) and (M, VU) |= φ. 2. There exists a set W (disjoint from X) of variables in V with (M, VU) |= ( W = VW ) and a setting V X of variables X such that (M, VU) |= [ X V X, W VW ] φ. 3. No proper sub-conjunction of ( X= VX) satisﬁes 1, 2. Notice that in Item 2, according to (M, VU) |= ( W = VW ), interventions that set variables in X to counterfactual values V X have to set all variables in W to their actual values VW in the actual context. The tuple W, VW , V X is called a witness to the fact that ( X = VX) is a cause of φ.

Example 1. Consider the two well-known Forest Fire examples from (Halpern and Pearl 2005; Halpern 2016). Both have the same set of endogenous variables: MD (match dropped by arsonist), L (lightning strike), FF (forest is on ﬁre). In both cases, MD and L are set to true by the context. The model Md for the disjunctive scenario has it that either one of the events (MD = true), (L = true) is sufﬁcient to start a ﬁre, so the equation for FF becomes FF := (MD = true) (L = true). The model Mc for the conjunctive scenario requires both events in order to create a forest ﬁre, so FF := (MD=true) (L=true). By HPm, neither (MD=true) nor (L=true) are singleton actual causes in Md because it is impossible to fulﬁll part 2 of the deﬁnition above by setting either variable to false, but the conjunction (MD = true) (L = true) is deemed an actual cause. In contrast, in Mc, both (MD = true) and (L = true) are singleton actual causes because setting one of {MD, L} to false makes the forest ﬁre impossible, but their conjunction is not an actual cause because it violates minimality (condition 3).

HPm is an improvement over the original proposal by (Halpern and Pearl 2005), in part, because it is able to differentiate between such conjunctive and disjunctive scenarios.

4 Proposal: The Achievement Causal Chain

We propose to axiomatize a dynamic world using a situation calculus theory and derive actual causality from ﬁrst principles. Speciﬁcally, to represent a scenario , we consider a BAT D and an accompanying narrative describing the actions/events which transpired in the world characterized by D. We do not formally distinguish between agent actions and nature s events. The narrative is speciﬁed by an executable ground situation term σ called the actual situation . An effect for which we seek to identify causes is given by a formula ϕ(s) uniform in situation s. Since actions are the sole source of change in a BAT, we identify the set of potential causes of an effect ϕ with the set of all ground action terms occurring in σ. To formally capture a scenario, we, like HP, introduce the notion of a causal setting.

Deﬁnition 1. A (SC) causal setting is a triple D, σ, ϕ(s) where D is a BAT, σ is a ground situation term such that D |= executable(σ), and ϕ(s) is a situation calculus formula uniform in s such that D |= s(executable(s) ϕ(s)).

Since the BAT D is ﬁxed in our approach, we typically refer to D, σ, ϕ(s) as just σ, ϕ(s) . Intuition provides few deﬁnite truths about actual causality, but we hold the following to be self-evident: If some action α of the action sequence σ triggers the formula ϕ(s) to change its truth value from false to true relative to D and if there is no action in σ after α that changes the value of ϕ(s) back to false, then α is an actual cause of achieving ϕ(s) in σ. This statement is sound because: (a) the narrative σ determines a total linear order on its actions, (b) change is associated with a particular element of that order, and (c) no change comes about other than by an action of σ. The next deﬁnition states this observation formally.

Deﬁnition 2. A causal setting C = σ, ϕ(s) satisﬁes the achievement condition via the situation term do(α, σ ) σ iff D |= ϕ(σ ) s (do(α, σ ) s σ ϕ(s)). Whenever a causal setting C satisﬁes the achievement condition via do(α, σ ), we say that the (ground) action α executed in σ is a (primary) achievement cause in C.

If a causal setting does not satisfy the achievement condition and ϕ(s) is non-tautological and holds throughout the narrative σ, then we ascribe the achievement of ϕ(s) to an unknowable cause masked by the initial situation S0. If ϕ(s) is a tautology, it legitimately has no cause. If ϕ(σ) is not entailed by D, meaning that ϕ(s) is not achieved by the end of the narrative, then its achievement cause truly does not exist.

Example 1 (cont.). We axiomatize the conjunctive Forest Fire example in a straight-forward way. Let MD(s), L(s), FF(s) be ﬂuents; let md, l be the agent s actions which affect the respective ﬂuents, and let ff be a (natural) event triggered by the previous actions.

Poss(md, s), Poss(l, s), Poss(ff, s) MD(s) L(s), MD(do(a, s)) a=md MD(s), L(do(a, s)) a=l L(s), FF(do(a, s)) a=ff FF(s).

The story does not specify a temporal order between md and l, so w.l.o.g. let us ﬁx a narrative where the match is dropped before the lightning strike: σ =do([md, l, ff], S0). The causal setting σ, FF(s) satisﬁes the achievement condition via the event ff, so ff executed in do([md, l], S0) is an achievement cause. This is obviously true, but not very useful. To ﬁnd the root cause we need a deeper analysis.

The notion of the achievement condition mentioned before forms our basic tool which, when used together with the single-step regression operator ρ, helps us not only ﬁnd the single action that brings about the effect of interest, but also identify the actions that build up to it. Intuitively, ρ[ϕ(s), α] is the weakest precondition that must hold in a previous situation σ in order for ϕ(s) to hold after performing α in σ . If we prove α to be an achievement cause of ϕ(s) in do(α, σ ), we can use single-step regression ρ to obtain a formula that holds at σ and constitutes a necessary and sufﬁcient condition for the achievement of ϕ(s) via α. This new formula may have an achievement cause of its own which, by virtue of α, also constructively contributes to the achievement of ϕ(s). By repeating this process, we can uncover the entire chain of actions that incrementally build up to the achievement of the ultimate effect. At the same time, we must not overlook the condition which makes the execution of α in σ even possible. This condition is conveniently captured by the right-hand side of the precondition axiom for α and may have achievement causes of its own. To sum up: if α is an achievement cause of ϕ(s) in do(α, σ ), then ρ[ϕ(s), α] and the precondition Πα(s) of α, taken together, express the condition which (a) holds at σ , (b) is necessary and sufﬁcient for executing α in σ , and (c) is necessary and sufﬁcient for achieving ϕ(s) via α. The following inductive deﬁnition formalizes this intuition.

Deﬁnition 3. If a causal setting C = σ, ϕ(s) satisﬁes the achievement condition via some situation term do(A( t), σ ) σ and α is an achievement cause in the causal setting σ , ρ[ϕ(s), A( t)] ΠA( t, s) , then α is an achievement cause in C. Clearly, the process of discovering intermediary achievement causes using single-step regression repeatedly cannot continue beyond S0. Since the given narrative σ is a ﬁnite sequence, the achievement causes of C also form a ﬁnite sequence which we call the achievement causal chain of C. Note that the actions of the achievement causal chain need not be adjacent in the action sequence of σ. In fact, in σ they can be interspersed with other actions irrelevant to the achievement of ϕ. Example 1 (cont.). Computing ρ[FF(s), ff] Poss(ff, s) by Deﬁnition 3 gives rise to a new causal setting do([md, l], S0), MD(s) L(s) . This setting satisﬁes the achievement condition via the action l, so l executed in do([md], S0) is an achievement cause. This yields yet another setting do([md], S0), MD(s) which meets the achievement condition via md, and the analysis terminates. Thus, in this example, all actions of σ constitute a causal chain leading up to FF(s). Observe that our choice of the narrative do([md, l, ff], S0) over the other possibility do([l, md, ff], S0) does not affect the conclusion: in the alternative narrative, all actions are also deemed causes. To model disjunctive Forest Fire, we replace the precondition axiom for ff by Poss(ff, s) MD(s) L(s). Like before, the causal setting do([md, l, ff], S0), FF(s) has an achievement cause ff and generates another setting do([md, l], S0), MD(s) L(s) . In contrast to the conjunctive case, however, this new setting has md as an achievement cause, and the analysis terminates at S0. The complete causal chain here consists of md, ff. The lightning strike is overlooked because the match was sufﬁcient for starting a ﬁre and occurred ﬁrst. This may seem like a limitation of our approach, but consider the alternative narrative do([l, md, ff], S0): there, the causal chain is l, ff, so we are able to identify all causally relevant events by considering all possible narratives that ﬁt the story. This example also illustrates just how well our approach handles preemption: if the story had stipulated that the match was dropped before the lightning strike, we would automatically discount the lightning strike as a cause without having to construct elaborate contingencies along the lines of HP. Note that our axiomatization of Forest Fire contains a triggered event ff. To provide SC semantics to such events, we adhere to Reiter s notion of natural actions (Reiter 2001) which modiﬁes executable(s) to force natural actions to occur as soon as their preconditions are realized.

5 Formal Relationship with HP In order to prove a formal relationship between Deﬁnition 3 and HPm, we ﬁrst need to establish a common ground between the two formalisms. We choose to do so by reformulating causal models in situation calculus. Let (M, VU) be a HP causal setting where M = U, V, R, F is an acyclic causal model and VU a context. We assume

that U, V, and the range of R are ﬁnite sets and there are no collisions between constants for variable and value symbols. We construct a BAT D from (M, VU) as follows. We treat U, V, and R(X) for all X U V as sets of SC constant symbols for which we introduce unique name axioms. If S = {C1, . . . , Cn} is a set of constants and y is a SC object term, the expression y S denotes (y = C1 . . . y=Cn). If X U V with R(X)={V1, . . . , Vn}, y R(X) denotes (y = V1 . . . y = Vn). To represent functions F, we introduce a situation-independent relational symbol f with arity 1+|U V|+1 where the ﬁrst argument is the name of the variable (X) which FX F determines, the last argument is the value which FX assigns to X, and the arguments in between are the values of variables U V arranged in some predetermined order. The actions of D are get(x, v), meaning compute the value of the endogenous variable x using Fx F, and set(x, v), meaning ignore Fx and force the value v upon x. The only ﬂuent of D is the relational ﬂuent V (x, v, s) stating that v is the value of the endogenous variable x in situation s. Let Det(x, v, s) be an abbreviation for

v1 . . . v N .

1 i N y y = Zi vi R(Zi)

v (V (y, v , s) vi = v ) f(x, v1, . . . , v N, v),

where U V = {Z1, . . . , ZN}, meaning that the value of variable x is determined in s to be v whenever the values vi which exist in s, when bound to appropriate arguments of f, unequivocally assign v to x. This means, crucially, that x may be determined as soon as some but not necessarily all of the variables on which it depends (as per ) have acquired values. The axioms of D are: X V v(V (X, v, S0)), VY VU v(V (Y, v, S0)) v(V (Y, v, S0) v=VY )),

Poss(set(x, v), s)

X V(x=X v R(X)) v V (x, v , s),

Poss(get(x, v), s)

x V v V (x, v , s) Det(x, v, s), V (x, v, do(a, s)) a=get(x, v) a=set(x, v) V (x, v, s).

In words, none of the endogenous variables have values at S0, and all exogenous variables have values at S0 as speciﬁed by the context. It is possible to force a value v upon x as long as x is an endogenous variable, v is in the range of x, and x has not yet acquired a value. It is possible to compute the value of x as long as x is an endogenous variable which has not yet acquired a value but which is destined at s to get the value v. Since preconditions disallow value reassignment, the SSA has no negative effects. Overall, the theory models all possible propagations of values throughout the set of variables according to the structural equations, as well as all propagations of values under interventions when some of the variables are forced to speciﬁed values. As we are interested only in those situations where all variables have acquired values, which represent a unique solution to F (possibly after interventions),

we introduce the abbreviation terminal(s) for the expression executable(s) a(Poss(a, s)). In order to refer to situations under speciﬁc interventions, we use the abbreviation schema interv Y1 VY1,...,Yk VYk (s) which stands for terminal(s) x v.[ s (do(set(x, v), s ) s)

1 i k(x = Yi v = VYi)]. The special case interv (s) describes s under the empty intervention. Notice that in any situation term S that satisﬁes interv Y1 VY1,...,Yk VYk (s) all actions are executable, but since S is terminal, no further actions are possible, S mentions an action set(Y, VY ) for every (Y VY ), and all other actions in S are get. Finally, given a HP query φ, we obtain a corresponding SC query ˆφ from φ by replacing each primitive event (X = VX) by V (X, VX, s). Notice that ˆφ is ground in all object arguments and uniform in s. This completes the translation. We can now prove the correctness of our axiomatization relative to a HP causal setting.

Theorem 1. Let (M, VU) be a HP causal setting, [ Y VY ]φ an arbitrary causal formula over M, and D a BAT obtained from (M, VU). Then (M, VU) |= [ Y VY ]φ iff D |= s(interv Y VY (s)) s(interv Y VY (s) ˆφ(s)).

Proof. (Sketch) The proof is straightforward by induction on the structure of φ. We prove the base case, when φ is a primitive event, by induction on the length of the situation. By construction, for every U U, we have D |= V (U, VU, S0) such that VU VU, i.e. all exogenous variables have unique, correct values at S0 (by correct we mean a value which agrees with the causal model). Also by construction, none of the endogenous variables have values at S0, i.e. they are not incorrect. From this base case, we can always construct an arbitrary narrative σ which conforms to the intervention [Y1 y1, . . . , Yk yk] and show that if it does, then it produces only correct values. We make the inductive assumption that there exists a sub-sequence σ of σ such that all values that exist at σ are correct. Recall that σ is not terminal whenever some subset V of the endogenous variables have not yet acquired values, i.e. D |= X V v V (X, v, σ ). Formally, the inductive assumption states that for every X V \ V and every VX R(X), D |= V (X, VX, σ ) if and only if (M, VU) |= [Y1 y1, . . . , Yk yk](X = VX). If V is empty, then σ = σ and our claim follows immediately. Otherwise, we arbitrarily select the next action among the following two options. Option 1: If one of the variables intervened upon does not have a value at σ , i.e. Yi V for some 1 i k, then set(Yi, VYi) is possible and trivially creates a correct value. Option 2: If there exists some Z V which is not among Y1, . . . , Yk and D |= Det(Z, VZ, σ ) for some VZ, then the action get(Z, VZ) is possible. By the deﬁnition of Det, VZ is obtained from the values which exist at σ (which we assumed are correct) using the structural equation for Z (which is the same for both BAT and causal model), so the computed value VZ for Z must agree with the causal model. Since the original causal model has a unique solution, because it is acyclic, if σ = σ, then at least one of these options is always applicable.

With this result, we can easily translate HPm to the language of SC and formally compare the two approaches. Theorem 2. Let (M, VU) be a HP causal setting and φ a HP query over M. Let D be a BAT obtained from (M, VU). Let X V and VX R(X).

1. (X =VX) is a singleton cause of φ in (M, VU) according to HPm if and only if get(X, VX) σ appears in the achievement causal chain of σ, ˆφ(s) for every ground situation term σ of D such that D |= interv (σ). 2. (X =VX) is a part of a cause of φ in (M, VU) according to HPm if and only if there exists a ground situation term σ of D such that D |= interv (σ) and get(X, VX) σ appears in the achievement causal chain of σ, ˆφ(s) .

Proof. We sketch the proof of Part 1 only; Part 2 is similar but more involved. As before, we prove the case where φ is a primitive event and leave the rest to induction on the structure of φ. In the sketch below, for the sake of simplicity, when we talk about single-step regression of formulas that include Poss(get(X, VX), s) as one of the conjuncts, we omit the terms related to v V (X, v , s) since they are determined by the initial theory, and keep only Det(X, VX, s). ( .) Suppose (X = VX) is a singleton cause of φ in (M, VU) according to HPm with a witness ( W, VW , V X). Take an arbitrary ground situation term σ of the BAT D, obtained from (M, VU), such that D |= interv (σ). Let σ be a terminal ground situation which coincides with σ up to and excluding get(X, VX), contains set(X, V X) in its place, and contains set(W, VW ) in the places of get(W, VW ) for all W W. By the deﬁnition of a witness, (M, VU) |= [X V X, W VW ] φ. By Theorem 1, D |= ˆφ(σ) and D |= ˆφ(σ ). By construction, all ﬂuent values in σ, σ agree up to the action get(X, VX)/set(X, V X), so the divergence of the values is accounted for either by V X or by some subsequent divergent value. We can show that the difference between VX and V X can explain the divergence of the values. More formally, let φ be Z = VZ; recall, we assume that effect φ is a primitive event. Since (M, VU) |= [X V X, W VW ] (Z = VZ), then (Z = V Z) must hold under the same intervention for some V Z = VZ. The primary achievement cause in σ, V (Z, VZ, s) is the action get(Z, VZ), which yields a new setting σ , Det(Z, VZ, s) . Since Z acquires a different value in σ , the achievement cause of the new causal setting must occur in σ no earlier than get(X, VX). If it is get(X, VX), we are ﬁnished. Otherwise, the same argument applies: we locate the achievement cause, some action get(Y, VY ) occurring no earlier than get(X, VX), with D |= V (Y, V Y , σ ), V Y = VY , and generate a new causal setting. Since σ is ﬁnite, the analysis converges to the case where the only possible cause is get(X, VX). ( .) For an arbitrary σ = do([get(Z1, VZ1), . . ., get(Zn, VZn)], S0) and an arbitrary Zk (k n) such that D |= interv (σ) V (Zk, VZk, σ), the primary achievement cause of σ, V (Zk, VZk, s) is always the action get(Zk, VZk) executed in some σ σ. The remainder of the causal chain is discovered recursively through the new causal setting σ , Det(Zk, VZk, s) . The remainder

is empty only if Zk is determined to be VZk from the outset by the context. Otherwise, σ , Det(Zk, VZk, s) has an achievement cause of its own. Since σ represents an empty intervention, this secondary cause is get(Zm, VZm) for some m < k. Observe that, by the deﬁnition of Det, get(Zm, VZm) is a cause precisely because the act of setting Zm to VZm removes any FZk-borne ambiguity as to the value of Zk. In other words, prior to get(Zm, VZm), the variable Zk could attain any of at least two possible values, but get(Zm, VZm) constrained it to eventually attain VZk. Therefore, there exists a (counterfactual) value V Zm R(Zm), V Zm = VZm which, if substituted for VZm, would lead to the value of Zk being different from VZk. Let σ be an alternative terminal situation which has the same actions as σ up to and excluding get(Zm, VZm); the latter is replaced in σ by set(Zm, V Zm), and all actions get(Zj, VZj) for m < j < k are replaced with set(Zj, VZj). Notice that the value of the cause is modiﬁed, while the subsequent values are forced to stay as they were. Then D |= V (Zk, VZk, σ ). Observe that the tuple ((Zm+1, . . . , Zk 1), (VZm+1, . . . , VZk 1), V Zm) constitutes a witness to the fact that (Zm = VZm) is an actual cause of (Zk =VZk) according to HPm. This argument extends by induction to causal chains of arbitrary lengths. Assume that σ , ψ(s) is a causal setting generated as the result of discovering a sequence of achievement causes get(Y1, VY1), . . . , get(Yt, VYt) via Definition 3 such that the occurrence of get(Yj, VYj) precedes the occurrence of get(Yj+1, VYj+1) in σ, i.e., get(Y1, VY1) occurs earlier than subsequent causes. The formula ψ(s) is the conjunction Det(Y1, VY1, s) t j=2 H(j) where each conjunct H(j), j > 1, is the result of recursively applying single-step regression to Det(Yj, VYj, s) over the actions get(Yj 1, VYj 1), . .., get(Y1, VY1) in order, because regression is done over later actions before it is done over earlier actions. This syntactic operation merely replaces the subexpression V (y, v , s) in the deﬁnition of Det(Yj, VYj, s) by the expression j 1 l=1 (y = Yl v = VYl) V (y, v , s), effectively binding the values of the causal chain discovered so far to the arguments of FYj. Now, by the previous argument, if σ , ψ(s) has a primary achievement cause, then it is some action get(Y0, VY0) which eliminates the ambiguity due to the corresponding structural equation in one of the conjuncts of ψ(s). (Note that, by an obvious property of Deﬁnition 3, the achievement causal chain of a conjunction contains the achievement causes of all conjuncts.) Therefore, we can exploit the ambiguity and construct an alternative situation where ψ(s) is falsiﬁed and extract a HPm witness to this fact. A straight-forward elaboration extends this argument to general (non-atomic) HP queries.

Example 1 (cont.). Consider a translation of the disjunctive Forest Fire causal model Md. Recall that neither (MD = true) nor (L = true) alone are singleton actual causes in Md, but (MD=true) (L=true) is an actual cause. Notice that in a previous SC formalization the actions should be renamed to match our translation rules. Namely, replace l with get(L, true), md with get(MD, true), ff with

get(FF, true). The corresponding terminal narratives σ are

do([get(MD, true), get(L, true), get(FF, true)], S0), do([get(L, true), get(MD, true), get(FF, true)], S0), do([get(MD, true), get(FF, true), get(L, true)], S0), do([get(L, true), get(FF, true), get(MD, true)], S0).

The action get(MD, true) is a part of the causal chain of σ, V (FF, true, s) only for the ﬁrst and third choice of σ. Similarly, get(L, true) is an achievement cause only for the second and fourth choice. By Part 1 of Theorem 2, they are not actual causes according to HPm. By Part 2 of Theorem 2, they are both parts of an actual cause according to HPm.

6 Discussion As discussed above, our approach shifts the focus away from causal models and towards ﬁrst order logic representation of the underlying dynamics of the scenario. There are other attempts to step away from HP s treatment of actual causality (Vennekens, Bruynooghe, and Denecker 2010; Vennekens 2011; Beckers and Vennekens 2012; 2016), but they fail to overcome the expressivity limitations. To our knowledge, the only attempt to lift these limitations was undertaken by (Hopkins 2005; Hopkins and Pearl 2007) who reformulate causal models in the language of situation calculus. In doing so, they arbitrarily designate some causal model variables as transitional and model them as actions, and others as enduring and model them as ﬂuents. In contrast, our translation is systematic and requires no additional modelling decisions. (Hopkins and Pearl 2007) preserve the implicit possible worlds semantics of causal formulas as a layer on top of the (many-sorted version of) standard ﬁrst order Tarskian semantics of SC and drop the requirement that situations be executable. The latter is especially problematic, since dismissing preconditions results in paradoxes. As an example, consider a BAT modelling the popular Blocks World domain, where the action move(x, y) stacks block x on top of block y and is possible only if there are no blocks on top of x and y; the action move To T(x) unstacks x and moves it to the table that can hold any number of blocks; the ﬂuent Clear(x, s) states that there is no block on top of block x in situation s; and the ﬂuent On(x, y, s) states that block x is on the top of block y in s. By purging the precondition for move(x, y) from the theory, it is easy to obtain a paradoxical situation σ = do([move(A, B), move(C, B), move To T(C)], S0) where the theory entails both Clear(B, σ) and On(A, B, σ). In this case, the query about the presence of something on top of B may yield two opposite answers, depending on how the modeller phrases it. We doubt that one can build a robust deﬁnition of actual causality on such shaky foundations. (Hopkins and Pearl 2007) neither attempted to give a formal deﬁnition of actual causality, nor provided connections with the causal models approach, as we did. Curiously, (Vennekens, Bruynooghe, and Denecker 2010) consider SC to be too expressive, stating that SC contains many features that go beyond what is traditionally expressed in a causal model. For typical causal reasoning problems, these features are not needed . To refute this statement and

to see where we stand with respect to other approaches, let us consider two telling examples featured in (Beckers and Vennekens 2012). Assume all ﬂuents are false at S0.

Example 2. Assassin poisons victim s coffee, victim drinks it and dies. If assassin had not poisoned the coffee, his backup would have, and victim would still have died. This example from (Hitchcock 2007) illustrates early preemption, namely that the causal link from the backup to victim s death is preempted by the assassin before the effect from backup s action can occur. Let the actions be assassin and backup (the two acts of poisoning the coffee) and selfexplanatory drink. Let the ﬂuents be P(s) meaning coffee contains poison and D(s) meaning the victim is dead .

Poss(assassin, s), Poss(backup, s), Poss(drink, s) P(s), P(do(a, s)) a=assassin a=backup P(s), D(do(a, s)) [a=drink P(s)] D(s).

The narrative σ = do([assassin, drink], S0) describes the given scenario and D |= D(σ). By our analysis, all of σ is an achievement causal chain. This agrees with HP and (Hitchcock 2007) but disagrees with Beckers and Vennekens who believe that assassin is not an actual cause. Rather than appeal to intuition, we just point out that the causal roles assumed by the assassin and his backup are clearly distinct in the given scenario, and were recognized as such by our analysis. Regardless of how reliable the assassin s backup is, he played no role. The action assassin explains exactly how the victim died; it is an actual cause.

Example 3. An engineer is standing by a switch in the railroad track. A train approaches in the distance. She ﬂips the switch, so that the train travels down the left-hand track instead of the right. Since the tracks re-converge up ahead, the train arrives at its destination all the same. This example is proposed by N.Hall (Hall 2000; Paul and Hall 2013) to illustrate the distinction between causation and determination of a causal route; its variants are discussed in many publications (Pearl 2000; Halpern and Pearl 2005; Weslake 2013). Beckers and Vennekens point out that this example is isomorphic to the previous one, except that the intuition about its causes is the polar opposite of that in Assassin . As we shall see, this dilemma is illusory, and the two examples are isomorphic only within the limited expressivity bounds of causal models and CP-logic. In Assassin , there are two competing actions, whereas here there is an action and its absence, a distinction which SC is well equipped to capture. Let the ﬂuent In(s) mean that the train is on the section of the track leading to the ﬁrst junction, let L(s) (resp., R(s)) mean that it is on the left-hand track (resp., right), and let Out(s) mean that it is on the section of the track past the second junction. Let the ﬂuent Sw(s) mean that the switch is engaged and Arrived(s) that the train has arrived. Let the actions be flip (engineer ﬂips the switch), fork1 (train passes ﬁrst junction), fork2 (train passes second junction),

and arrive (self-explanatory). Let only In(s) hold at S0.

Poss(flip, s), Poss(fork1, s) In(s), Poss(fork2, s) L(s) R(s), Poss(arrive, s) Out(s), In(do(a, s)) In(s) a =fork1, L(do(a, s)) a=fork1 Sw(s) L(s) a =fork2, R(do(a, s)) a=fork1 Sw(s) R(s) a =fork2, Out(do(a, s)) a=fork2 Out(s), Sw(do(a, s)) a=flip Sw(s) a =flip, Arrived(do(a, s)) a=arrive Arrived(s).

The narrative σ is do([flip, fork1, fork2, arrive], S0). The causal setting σ, Arrived(s) has a cause arrive. The next setting is do([flip, fork1, fork2], S0), Out(s) with a cause fork2. Computing ρ[Out(s), fork2] Poss(fork2, s) by Deﬁnition 3 yields a new setting do([flip, fork1], S0), L(s) R(s) with a cause fork1. The ﬁnal setting is do([flip], S0), ψ(s) where ψ(s) is (Sw(s) L(s)) ( Sw(s) R(s)) which is a tautology and yields no further causes. Therefore, the flip action is not an actual cause of train s arrival in a faithful SC model of this example, no matter whether the action flip is executed or not. This conclusion is elaboration tolerant (Mc Carthy 1998) as long as the relation between L, R, Sw is preserved. For HP, the answer depends on how the model is constructed and which deﬁnition is applied. (Pearl 2000) calls this class of problems switching causation and argues that ﬂipping the switch is a cause (see Section 10.3.4, p.324 5). In a simpliﬁed setting with the propositional causal variables Sw, L, R, Arrived, consider the equations Sw := true, L := Sw, R := Sw, Arrived := L R. According to the original and updated deﬁnitions of actual cause, both (Pearl 2000) and (Halpern and Pearl 2005) argue that the switch is a cause. But according to the HPm deﬁnition, the switch is not a cause (Halpern 2016). If we start with this reduced causal model with four variables and translate it to a BAT as proposed in Section 5, we would get the same conclusion as in HPm, i.e., Theorem 2 applies here too.

The treatment of causality in (Vennekens 2011) is somewhat clouded by using probabilistic rules of the CP-logic, but in fact actual causality can be deﬁned without appeal to probability (Halpern and Pearl 2005; Halpern 2016). (Beckers and Vennekens 2016) introduce concepts of dependence, contribution and production to deﬁne basic principles for analysis of actual causality, but their language remains inexpressive with no distinction between properties and actions, and quantiﬁed effects are not allowed either.

Example 4. For an example of a quantiﬁed query, consider a world with the blocks {B1, B2, B3, . . .} axiomatized so that they form an inﬁnite chain. Let the ﬂuents be On(x,y,s), block x is on block y, Clear(x,s), x is clear, On Table(x,s), x is on the table. Let the actions be move(x,y), move x on

y, move To T(x), move x to the table.

Poss(move(x, y), s) Clear(x, s) Clear(y, s), Poss(move To T(x), s) Clear(x, s) y On(x, y, s), On(x, y, do(a, s)) a=move(x, y) On(x, y, s) z(a=move(x, z)) a =move To T(x), On Table(x, do(a, s)) a=move To T(x) On Table(x, s) y (a=move(x, y) ), Clear(x, do(a, s)) y, z(a=move(y, z) On(y,x,s)) y(a=move To T(y) On(y, x, s)) Clear(x, s) y(a=move(y, x)).

Let us assume that initially all blocks are on the table. To show that we can handle quantiﬁed causal queries, consider the narrative σ = do([move(B1, B2), move(B1, B3)], S0) and the effect x(on(B1, x, s)). It is easy to see that according to our deﬁnition, the ﬁrst action move(B1, B2) is an actual cause of the effect, while the second action is not, since it was preempted by the ﬁrst action. In addition to actual achievement causes, it is natural to consider actual maintenance causes. These are the causes responsible for protecting a previously achieved effect, despite potential threats that could destroy the effect. For example, a mitigating action serves as a maintenance cause when it is executed before a threat occurs in a narrative. Our paper (Batusov and Soutchanski 2017) investigates how the notions of achievement and maintenance causes can be combined together into a general deﬁnition of an actual cause. SC includes foundational axioms Σ formulated in secondorder logic. However, according to Theorem 1 in (Pirri and Reiter 1999), a BAT D is satisﬁable iff DS0 UNA is. Additionally, according to Theorem 3 in their paper (Regression Theorem 4.5.5 in (Reiter 2001)), a regressable sentence is entailed by a BAT D iff the regressed sentence is entailed by DS0 UNA alone, and if DS0 is formulated in ﬁrst order logic (FOL) then this can be reduced to theorem proving in FOL. Moreover, as Reiter argues in Section 4.8 of (Reiter 2001)), in practical applications, when actions have unconditional effects, or when context conditions are situation-free formulas that are decidable wrt DS0 UNA, then the computational complexity of answering projection queries using regression adds at most linear complexity to the complexity of evaluating ground ﬂuents wrt DS0 UNA. The decidability condition is easily satisﬁed in the case when the object domain is ﬁnite, but there are other fragments of SC where the projection problem is reduced to a decidable entailment problem. (Eiter and Lukasiewicz 2002) established that the complexity of deciding whether X = x is a cause of Y = y is NP-complete in Boolean causal models using an older deﬁnition of actual cause. More recently, (Aleksandrowicz et al. 2017) explored the complexity of computing actual causes according to the modiﬁed deﬁnition HPm.

7 Concluding Remarks Despite its ingenuity and demonstrated utility, the HP analysis based on causal models has its drawbacks. There exist multiple examples for which the results of the HP approach cannot be reconciled with intuitive understanding

which, incidentally, the approach treats as the only measure of merit. This problem was traced by (Hopkins and Pearl 2007) and (Glymour et al. 2010) to the limited expressiveness of causal models. A weakness speciﬁc to the HP approach, pointed out by (Glymour et al. 2010) and somewhat mended by (Beckers and Vennekens 2012), stems from its disregard for the order of events which transpire in the given scenario. Such valuable information should not be discarded without a good reason. We believe that this is the essential methodological difference between their approach and ours. Our work reaps the beneﬁts which (Hopkins and Pearl 2007) aimed at in their choice of situation calculus as the modelling language, but does not suffer from the issues associated with attempting a meaningful deﬁnition of a counterfactual in situation calculus, which appears to be no easy task. A counterfactual query not relativized to a particular scenario can be formulated in situation calculus without appealing to special tools (Lin and Soutchanski 2011), but it is not clear how such queries can be useful for deﬁning actual causality. An original study conducted in (Costello and Mc Carthy 1999) perhaps comes closest to a good deﬁnition of a counterfactual in situation calculus, but it operates outside of the well-studied basic action theories and is not concerned with actual causality. The seminal line of enquiry into actual causality stems from Hume and includes such works as (Mackie 1965) and (Lewis 1974). Aside from the aforementioned works, original computational accounts of actual causality are rare, owing, perhaps, to the ubiquity and the appealing simplicity of causal models. There exist numerous studies of the semantics of causal models and the relationship of causal models to various logics, such as an elaborate axiomatization of causal models (Halpern 2000) and a logical representation (Bochman and Lifschitz 2015) of causal models in a non-monotonic logic which encompasses general causation as a foundational principle. The approach of (Finzi and Lukasiewicz 2003) combines causal models with independent choice logic. It is clear that a broader deﬁnition of actual cause requires more expressive action theories that can model not only sequences of actions, but can also include explicit time and concurrent actions. Only after that one can try to analyze some of the popular examples of actual causation formulated in philosophical literature. Some of those examples sound deceptively simple, but faithful modelling of them requires time, concurrency and natural actions (Reiter 2001). This does not imply that future research should focus only on popular scenarios proposed by philosophers. To the contrary, we ﬁrmly believe that the future of causal research is in elaborating computational methodology for the analysis of complex technical systems, e.g., see (Halpern 2016).

Acknowledgements

We thank the Natural Sciences and Engineering Research Council of Canada for ﬁnancial support.

References Aleksandrowicz, G.; Chockler, H.; Halpern, J. Y.; and Ivrii, A. 2017. The computational complexity of structure-based causality. J. Artif. Intell. Res. 58:431 451. Batusov, V., and Soutchanski, M. 2017. Situation calculus semantics for actual causality. In 13th International Symposium on Commonsense Reasoning. University College London, UK. Monday, November 6. Beckers, S., and Vennekens, J. 2012. Counterfactual dependency and actual causation in CP-logic and structural models: a comparison. In Proceedings of the Sixth Starting AI Researchers Symposium, volume 241, 35 46. Beckers, S., and Vennekens, J. 2016. A principled approach to deﬁning actual causation. Synthese. DOI 10.1007/s11229-016-1247-1. Bochman, A., and Lifschitz, V. 2015. Pearl s causality in a logical setting. In Proceedings of the Twenty-Ninth AAAI Conference on Artiﬁcial Intelligence, AAAI 15, 1446 1452. AAAI Press. Costello, T., and Mc Carthy, J. 1999. Useful counterfactuals. Electron. Trans. Artif. Intell. 3(A):51 76. Eiter, T., and Lukasiewicz, T. 2002. Complexity results for structure-based causality. Artif. Intell. 142(1):53 89. Finzi, A., and Lukasiewicz, T. 2003. Structure-based causes and explanations in the independent choice logic. In Proceedings of the Nineteenth Conference on Uncertainty in Artiﬁcial Intelligence, UAI 03, 225 323. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. Glymour, C.; Danks, D.; Glymour, B.; Eberhardt, F.; Ramsey, J.; Scheines, R.; Spirtes, P.; Teng, C. M.; and Zhang, J. 2010. Actual causation: a stone soup essay. Synthese 175(2):169 192. Hall, N. 2000. Causation and the price of transitivity. Journal of Philosophy 97(4):198 222. Halpern, J. Y., and Pearl, J. 2005. Causes and explanations: A structural-model approach. Part I: Causes. The British Journal for the Philosophy of Science 56(4):843 887. Halpern, J. Y. 2000. Axiomatizing causal reasoning. J. Artif. Intell. Res. (JAIR) 12:317 337. Halpern, J. Y. 2015. A modiﬁcation of the Halpern-Pearl deﬁnition of causality. In Proceedings of the Twenty-Fourth International Joint Conference on Artiﬁcial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, 3022 3033. Halpern, J. Y. 2016. Actual Causality. The MIT Press, ISBN 9780262035026. Hitchcock, C. 2007. Prevention, preemption, and the principle of sufﬁcient reason. The Philosophical Review 116(4):495 532. Hopkins, M., and Pearl, J. 2007. Causality and counterfactuals in the situation calculus. Journal of Logic and Computation 17(5):939 953. Hopkins, M. 2005. The Actual Cause: From Intuition to Automation. Ph.D. Dissertation, University of California Los Angeles.

Lewis, D. 1974. Causation. The Journal of Philosophy 70(17):556 567. Lin, F., and Reiter, R. 1994. State constraints revisited. Journal of logic and computation 4(5):655 677. Lin, F., and Soutchanski, M. 2011. Causal theories of actions revisited. In AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning. Mackie, J. L. 1965. Causes and conditions. American Philosophical Quarterly 2(4):245 264. Mc Carthy, J., and Hayes, P. 1969. Some philosophical problems from the standpoint of artiﬁcial intelligence. Machine Intelligence 4:463 502. Mc Carthy, J. 1998. Elaboration tolerance. In Proc. of the 4th Symposium on Logical Formalizations of Commonsense Reasoning, 198 216. Queen Mary and Westﬁeld College, University of London, UK. http://www-formal.stanford. edu/jmc/elaboration/. Menzies, P. 2014. Counterfactual theories of causation. In Stanford Encyclopedia of Philosophy, https://plato.stanford. edu/entries/causation-counterfactual/. Retrieved on January 15, 2017. Paul, L., and Hall, N. 2013. Causation: a user s guide. Oxford University Press, ISBN 978-0199673452. Pearl, J. 1998. On the deﬁnition of actual cause. Technical report, R-259, University of California Los Angeles. Pearl, J. 2000. Causality: Models, Reasoning, and Inference. Cambridge University Press, 1st edition. Pirri, F., and Reiter, R. 1999. Some contributions to the metatheory of the situation calculus. Journal of the ACM (JACM) 46(3):325 361. Reiter, R. 1991. The frame problem in the situation calculus: A simple solution (sometimes) and a completeness result for goal regression. Artiﬁcial intelligence and mathematical theory of computation: papers in honor of John Mc Carthy 359 380. Reiter, R. 2001. Knowledge in action: logical foundations for specifying and implementing dynamical systems. The MIT Press, ISBN: 9780262527002. Simon, H. A. 1953. Causal ordering and identiﬁability. In Hood, W., and Koopmans, T., eds., Studies in Econometric Methods. New York: Wiley. chapter 3, 49 74. Simon, H. A. 1977. Causal ordering and identiﬁability. In Models of Discovery. Dordrecht: D.Reidel/Springer. 53 80. Vennekens, J.; Bruynooghe, M.; and Denecker, M. 2010. Embracing events in causal modelling: Interventions and counterfactuals in CP-logic. In European Workshop on Logics in Artiﬁcial Intelligence, 313 325. Springer. Vennekens, J. 2011. Actual causation in CP-logic. Theory and Practice of Logic Programming 11(4-5):647 662. Weslake, B. 2013. A partial theory of actual causation. In http://bweslake.s3.amazonaws.com/research/ papers/weslake ac.pdf. Retrieved on July 18, 2017.