# The Computational Complexity of Structure-Based Causality

Journal of Artificial Intelligence Research 58 (2017) 431-451. Submitted 6/16; published 02/17.

Gadi Aleksandrowicz (GADIA@IL.IBM.COM), IBM Research Lab, Haifa, Israel
Hana Chockler (HANA.CHOCKLER@KCL.AC.UK), Department of Informatics, King's College London, UK
Joseph Y. Halpern (HALPERN@CS.CORNELL.EDU), Computer Science Department, Cornell University, Ithaca, NY 14853, USA
Alexander Ivrii (ALEXI@IL.IBM.COM), IBM Research Lab, Haifa, Israel

© 2017 AI Access Foundation. All rights reserved.

Halpern and Pearl introduced a definition of actual causality; Eiter and Lukasiewicz showed that computing whether $X = x$ is a cause of $Y = y$ is NP-complete in binary models (where all variables can take on only two values) and $\Sigma^P_2$-complete in general models. In the final version of their paper, Halpern and Pearl slightly modified the definition of actual cause, in order to deal with problems pointed out by Hopkins and Pearl. As we show, this modification has a nontrivial impact on the complexity of computing whether $X = x$ is a cause of $Y = y$. To characterize the complexity, a new family $D^P_k$, $k = 1, 2, 3, \ldots$, of complexity classes is introduced, which generalizes the class $D^P$ introduced by Papadimitriou and Yannakakis ($D^P$ is just $D^P_1$). We show that the complexity of computing causality under the updated definition is $D^P_2$-complete. Chockler and Halpern extended the definition of causality by introducing notions of responsibility and blame, and characterized the complexity of determining the degree of responsibility and blame using the original definition of causality. Here, we completely characterize the complexity using the updated definition of causality. In contrast to the results on causality, we show that moving to the updated definition does not result in a difference in the complexity of computing responsibility and blame.

1. Introduction

There have been many attempts to define causality going back to Hume (1739), and continuing to the present (see, e.g., Collins, Hall, & Paul, 2004 and Pearl, 2000 for some recent work). The standard definitions of causality are based on counterfactual reasoning. Very roughly speaking, this means that $A$ is a cause of $B$ if (counterfactually) had $A$ not occurred, then $B$ would not have occurred (although, in fact, both $A$ and $B$ did occur). In this paper, we focus on one such definition, due to Halpern and Pearl, that has proved quite influential recently.

The definition was originally introduced in 2001 (Halpern & Pearl, 2001), but then modified in the final journal version (Halpern & Pearl, 2005a) to deal with problems pointed out by Hopkins and Pearl (2003). (For ease of reference, we call these definitions the original HP definition and the updated HP definition in the sequel.) In general, what can be a cause in both the original HP definition and the updated definition is a conjunction of the form $X_1 = x_1 \wedge \ldots \wedge X_k = x_k$, abbreviated $\vec{X} = \vec{x}$; what is caused can be an arbitrary Boolean combination $\varphi$ of formulas of the form $(Y = y)$. This should be thought of as saying that the fact that $X_1 = x_1$ and $\ldots$ and $X_k = x_k$ is the cause of $\varphi$ being true. As shown by Eiter and Lukasiewicz (2002) and Hopkins (2001), under the original HP definition, we can always take causes to be single conjuncts, that is, expressions of the form $X = x$ (also referred to as singletons throughout the paper). However, as shown by Halpern (2008), this is not the case for the updated HP definition.
Using the fact that causes can be taken to be single conjuncts, Eiter and Lukasiewicz (2002) showed that deciding causality (i.e., deciding whether $X = x$ is a cause of $\varphi$) is NP-complete in binary models (where all variables can take on only two values) and $\Sigma^P_2$-complete in general models.$^1$ As we show here, this is no longer the case for the updated HP definition. Indeed, we completely characterize the complexity of causality for the updated HP definition. To do so, we introduce a new family of complexity classes that may be of independent interest. Papadimitriou and Yannakakis (1984) introduced the complexity class $D^P$, which consists of all languages $L_3$ such that there exists a language $L_1$ in NP and a language $L_2$ in co-NP such that $L_3 = L_1 \cap L_2$. We generalize this by defining $D^P_k$ to consist of all languages $L_3$ such that there exists a language $L_1 \in \Sigma^P_k$ and a language $L_2 \in \Pi^P_k$ such that $L_3 = L_1 \cap L_2$.$^2$

Since $\Sigma^P_1$ is NP and $\Pi^P_1$ is co-NP, $D^P_1$ is Papadimitriou and Yannakakis's $D^P$. We then show that deciding causality under the updated HP definition is $D^P_2$-complete, both for binary and general causal models. Papadimitriou and Yannakakis (1984) showed that a number of problems of interest were $D^P$-complete. To the best of our knowledge, this is the first time that a natural problem has been shown to be complete for $D^P_2$.

Although, in general, causes may not be single conjuncts, as observed by Halpern (2008), in many cases (in particular, in all the standard examples studied in the literature), they are. In an effort to understand the extent to which the difficulty in deciding causality stems from the fact that causes may require several conjuncts, we consider what we call the singleton cause problem; that is, the problem of deciding if $X = x$ is a cause of $\varphi$ (i.e., where there is only a single conjunct in the cause).
We show that the singleton cause problem is simpler than the general causality problem (unless the polynomial hierarchy collapses): it is $\Sigma^P_2$-complete for both binary and general causal models. Thus, if we restrict to singleton causes (which we can do without loss of generality under the original HP definition), the complexity of deciding causality in general models is the same under the original and the updated HP definition, but in binary models, it is still simpler under the original HP definition.

Footnote 1: Note that we said "deciding whether $X = x$ is a cause of $\varphi$"; in general, there may be more than one cause of $\varphi$.

Footnote 2: $D^P = BH_2$; that is, $D^P$ is, by definition, level 2 in the Boolean hierarchy (Chang & Kadin, 1996). However, it is not hard to show that $D^P_k$ is not in the Boolean hierarchy for $k > 1$.

Causality is a 0–1 concept; $\vec{X} = \vec{x}$ is either a cause of $\varphi$ or it is not. Now consider two voting scenarios: in the first, Mr. G beats Mr. B by a vote of 11–0. In the second, Mr. G beats Mr. B by a vote of 6–5. According to both the original and the updated HP definition, all the people who voted for Mr. G are causes of him winning. While this does not seem so unreasonable, it does not capture the intuition that each voter for Mr. G is more critical to the victory in the case of the 6–5 vote than in the case of the 11–0 vote. The notion of degree of responsibility, introduced by Chockler and Halpern (2004), does so. The idea is that the degree of responsibility of $X = x$ for $\varphi$ is $1/(k + 1)$, where $k$ is the least number of changes that have to be made in order to make $X = x$ critical. In the case of the 6–5 vote, no changes have to be made to make each voter for Mr. G critical for Mr. G's victory; if he had not voted for Mr. G, Mr. G would not have won. Thus, each voter has degree of responsibility 1 (i.e., $k = 0$).
On the other hand, in the case of the 11–0 vote, for a particular voter to be critical, five other voters have to switch their votes; thus, $k = 5$, and each voter's degree of responsibility is $1/6$. This notion of degree of responsibility has been shown to capture (at a qualitative level) the way people allocate responsibility (Gerstenberg & Lagnado, 2010; Lagnado, Gerstenberg, & Zultan, 2013).

Chockler and Halpern further extended the notion of degree of responsibility to degree of blame. Formally, the degree of blame is the expected degree of responsibility. This is perhaps best understood by considering a firing squad with ten excellent marksmen. Only one of them has live bullets in his rifle; the rest have blanks. The marksmen do not know which of them has the live bullets. The marksmen shoot at the prisoner and he dies. The only marksman that is the cause of the prisoner's death is the one with the live bullets. That marksman has degree of responsibility 1 for the death; all the rest have degree of responsibility 0. However, each of the marksmen has degree of blame $1/10$.

The complexity of determining the degree of responsibility and blame using the original definition of causality was completely characterized (Chockler & Halpern, 2004; Chockler, Halpern, & Kupferman, 2008). Again, we show that changing the definition of causality affects the complexity, and completely characterize the complexity of determining the degree of responsibility and blame with the updated definition.

The rest of this paper is organized as follows. In Section 2, we review the relevant definitions of causality. In Section 3, we briefly review the relevant definitions from complexity theory and define the complexity classes $D^P_k$. In Section 4, we prove our results on the complexity of causality. In Section 5, we review responsibility and blame and characterize their complexity. We conclude in Section 6. Some proofs are deferred to the appendices.

2. Causal Models and Causality: A Review

In this section, we review the details of Halpern and Pearl's definition of causal models and causality, describing both the original definition and the updated definition. This material is largely taken from (Halpern & Pearl, 2005a), to which we refer the reader for further details.

2.1 Causal Models

A signature is a tuple $\mathcal{S} = (\mathcal{U}, \mathcal{V}, \mathcal{R})$, where $\mathcal{U}$ is a finite set of exogenous variables, $\mathcal{V}$ is a finite set of endogenous variables, and $\mathcal{R}$ associates with every variable $Y \in \mathcal{U} \cup \mathcal{V}$ a finite nonempty set $\mathcal{R}(Y)$ of possible values for $Y$. Intuitively, the exogenous variables are ones whose values are determined by factors outside the model, while the endogenous variables are ones whose values are ultimately determined by the exogenous variables. A causal model over signature $\mathcal{S}$ is a tuple $M = (\mathcal{S}, \mathcal{F})$, where $\mathcal{F}$ associates with every endogenous variable $X \in \mathcal{V}$ a function $F_X$ such that

$$F_X : \Big(\times_{U \in \mathcal{U}} \mathcal{R}(U)\Big) \times \Big(\times_{Y \in \mathcal{V} \setminus \{X\}} \mathcal{R}(Y)\Big) \to \mathcal{R}(X).$$

That is, $F_X$ describes how the value of the endogenous variable $X$ is determined by the values of all other variables in $\mathcal{U} \cup \mathcal{V}$. If $\mathcal{R}(Y)$ contains only two values for each $Y \in \mathcal{U} \cup \mathcal{V}$, then we say that $M$ is a binary causal model.

We can describe (some salient features of) a causal model $M$ using a causal network. A causal network is a graph with nodes corresponding to the random variables in $\mathcal{V}$ and an edge from a node labeled $X$ to one labeled $Y$ if $F_Y$ depends on the value of $X$. Intuitively, variables can have a causal effect only on their descendants in the causal network; if $Y$ is not a descendant of $X$, then a change in the value of $X$ has no effect on the value of $Y$. For ease of exposition, we restrict attention to what are called recursive models. These are ones whose associated causal network is a directed acyclic graph (that is, a graph that has no cycle of edges).
Actually, it suffices for our purposes that, for each setting $\vec{u}$ for the variables in $\mathcal{U}$, there is no cycle among the edges of the causal network. We call a setting $\vec{u}$ for the variables in $\mathcal{U}$ a context. If $M$ is a recursive causal model, then there is always a unique solution to the equations in $M$ (given by $\mathcal{F}$), in a given context.

The equations determined by $\{F_X : X \in \mathcal{V}\}$ can be thought of as representing processes (or mechanisms) by which values are assigned to variables. For example, if $F_X(Y, Z, U) = Y + U$ (which we usually write as $X = Y + U$), then if $Y = 3$ and $U = 2$, then $X = 5$, regardless of how $Z$ is set. This equation also gives counterfactual information. It says that, in the context $U = 4$, if $Y$ were 4, then $X$ would be 8, regardless of what value $X$ and $Z$ actually take in the real world. That is, if $U = 4$ and the value of $Y$ were set to 4 (regardless of its actual value), then the value of $X$ would be 8.

While the equations for a given problem are typically obvious, the choice of variables may not be. Consider the following example due to Hall (2004), showing that the choice of variables influences the causal analysis. Suppose that Suzy and Billy both pick up rocks and throw them at a bottle. Suzy's rock gets there first, shattering the bottle. Since both throws are perfectly accurate, Billy's would have shattered the bottle had Suzy not thrown. In this case, a naive model might have an exogenous variable $U$ that encapsulates whatever background factors cause Suzy and Billy to decide to throw the rock (the details of $U$ do not matter, since we are interested only in the context where $U$'s value is such that both Suzy and Billy throw), a variable ST for "Suzy throws" (ST = 1 if Suzy throws, and ST = 0 if she doesn't), a variable BT for "Billy throws", and a variable BS for "bottle shatters". In the naive model, whose graph is given in Figure 1, BS is 1 if one of ST and BT is 1.

Figure 1: A naive model for the rock-throwing example.
This causal model does not distinguish between Suzy and Billy's rocks hitting the bottle simultaneously and Suzy's rock hitting first. A more sophisticated model might also include variables SH and BH, for "Suzy's rock hits the bottle" and "Billy's rock hits the bottle". It is immediate from the equations that BS is 1 iff one of SH and BH is 1. However, now, SH is 1 if ST is 1, and BH = 1 if BT = 1 and SH = 0. Thus, Billy's throw hits if Billy throws and Suzy's rock doesn't hit. This model is described by the graph in Figure 2, where we implicitly assume a context where Suzy throws first, so there is an edge from SH to BH, but not one in the other direction (and omit the exogenous variable).

Figure 2: A better model for the rock-throwing example.

Given a causal model $M = (\mathcal{S}, \mathcal{F})$, a (possibly empty) vector $\vec{X}$ of variables in $\mathcal{V}$, and a vector $\vec{x}$ of values for the variables in $\vec{X}$, we define a new causal model, denoted $M_{\vec{X} \leftarrow \vec{x}}$, which is identical to $M$, except that the equation for the variables $\vec{X}$ in $\mathcal{F}$ is replaced by $\vec{X} = \vec{x}$. Intuitively, this is the causal model that results when the variables in $\vec{X}$ are set to $\vec{x}$ by some external action that affects only the variables in $\vec{X}$ (and overrides the effects of the causal equations). For example, if $M$ is the more sophisticated model for the rock-throwing example, then $M_{ST \leftarrow 0}$ is the model where Suzy doesn't throw.

Given a signature $\mathcal{S} = (\mathcal{U}, \mathcal{V}, \mathcal{R})$, a formula of the form $X = x$, for $X \in \mathcal{V}$ and $x \in \mathcal{R}(X)$, is called a primitive event. A basic causal formula has the form $[Y_1 \leftarrow y_1, \ldots, Y_k \leftarrow y_k]\varphi$, where $\varphi$ is a Boolean combination of primitive events; $Y_1, \ldots, Y_k$ are distinct variables in $\mathcal{V}$; and $y_i \in \mathcal{R}(Y_i)$. Such a formula is abbreviated as $[\vec{Y} \leftarrow \vec{y}]\varphi$. The special case where $k = 0$ is abbreviated as $\varphi$. Intuitively, $[Y_1 \leftarrow y_1, \ldots, Y_k \leftarrow y_k]\varphi$ says that $\varphi$ holds in the counterfactual world that would arise if $Y_i$ is set to $y_i$, for $i = 1, \ldots, k$.
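The intervention semantics above can be illustrated with a small sketch of the more sophisticated rock-throwing model. This is only an illustration under stated assumptions: the names `EQUATIONS` and `solve` are hypothetical, the exogenous variable is collapsed into a single binary `U` with `U = 1` standing for the context in which both Suzy and Billy throw, and the equations are listed in an order compatible with the causal network so that a single pass computes the unique solution.

```python
from typing import Callable, Dict, Optional

Values = Dict[str, int]

# Structural equations, parents before children (assumed topological order).
# Assumption: U = 1 is the context in which both Suzy and Billy throw.
EQUATIONS: Dict[str, Callable[[Values], int]] = {
    "ST": lambda v: v["U"],                               # Suzy throws
    "BT": lambda v: v["U"],                               # Billy throws
    "SH": lambda v: v["ST"],                              # Suzy's rock hits
    "BH": lambda v: int(v["BT"] == 1 and v["SH"] == 0),   # Billy's rock hits
    "BS": lambda v: int(v["SH"] == 1 or v["BH"] == 1),    # bottle shatters
}


def solve(context: Values, intervention: Optional[Values] = None) -> Values:
    """Return the unique solution in the given context.

    Variables named in `intervention` ignore their equations, which
    corresponds to evaluating the model M_{X <- x}.
    """
    intervention = intervention or {}
    values = dict(context)
    for var, equation in EQUATIONS.items():
        values[var] = intervention[var] if var in intervention else equation(values)
    return values


actual = solve({"U": 1})
no_suzy = solve({"U": 1}, {"ST": 0})            # the model M_{ST <- 0}
nobody = solve({"U": 1}, {"ST": 0, "BT": 0})
print(actual["BS"], actual["BH"])   # bottle shatters, Billy's rock does not hit
print(no_suzy["BS"], no_suzy["BH"]) # Billy's rock hits instead
print(nobody["BS"])                 # bottle survives
```

Evaluating a basic causal formula $[\vec{Y} \leftarrow \vec{y}]\varphi$ then amounts to calling `solve` with the intervention and testing $\varphi$ on the returned values.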
A causal formula is a Boolean combination of basic causal formulas. A causal formula $\varphi$ is true or false in a causal model, given a context. We write $(M, \vec{u}) \models \varphi$ if $\varphi$ is true in causal model $M$ given context $\vec{u}$. $(M, \vec{u}) \models [\vec{Y} \leftarrow \vec{y}](X = x)$ if the variable $X$ has value $x$ in the unique (since we are dealing with recursive models) solution to the equations in $M_{\vec{Y} \leftarrow \vec{y}}$ in context $\vec{u}$ (i.e., the unique vector of values for the endogenous variables that simultaneously satisfies all equations $F_Z^{\vec{Y} \leftarrow \vec{y}}$, $Z \in \mathcal{V} - \vec{Y}$, with the variables in $\mathcal{U}$ set to $\vec{u}$). We extend the definition to arbitrary causal formulas in the obvious way.

2.2 Causality

We now review the updated HP definition of causality.

Definition 2.1 $\vec{X} = \vec{x}$ is a cause of $\varphi$ in $(M, \vec{u})$ if the following three conditions hold:

AC1. $(M, \vec{u}) \models (\vec{X} = \vec{x}) \wedge \varphi$.

AC2. There exist a partition $(\vec{Z}, \vec{W})$ of $\mathcal{V}$ with $\vec{X} \subseteq \vec{Z}$ and some setting $(\vec{x}', \vec{w})$ of the variables in $(\vec{X}, \vec{W})$ such that if $(M, \vec{u}) \models Z = z^*$ for $Z \in \vec{Z}$, then

(a) $(M, \vec{u}) \models [\vec{X} \leftarrow \vec{x}', \vec{W} \leftarrow \vec{w}]\neg\varphi$.

(b) $(M, \vec{u}) \models [\vec{X} \leftarrow \vec{x}, \vec{W}' \leftarrow \vec{w}, \vec{Z}' \leftarrow \vec{z}^*]\varphi$ for all subsets $\vec{Z}'$ of $\vec{Z} \setminus \vec{X}$ and all subsets $\vec{W}'$ of $\vec{W}$, where we abuse notation and write $\vec{W}' \leftarrow \vec{w}$ to denote the assignment where the variables in $\vec{W}'$ get the same values as they would in the assignment $\vec{W} \leftarrow \vec{w}$, and similarly for $\vec{Z}' \leftarrow \vec{z}^*$. That is, setting any subset $\vec{W}'$ of $\vec{W}$ to the values in $\vec{w}$ should have no effect on $\varphi$ as long as $\vec{X}$ has the value $\vec{x}$, even if all the variables in an arbitrary subset of $\vec{Z}$ are set to their values in the context $\vec{u}$.

The tuple $(\vec{W}, \vec{w}, \vec{x}')$ is said to be a witness to the fact that $\vec{X} = \vec{x}$ is a cause of $\varphi$.

AC3. $(\vec{X} = \vec{x})$ is minimal; no strict subset of $\vec{X}$ satisfies AC2.

If $\vec{X}$ is a singleton, then $X = x$ is said to be a singleton cause of $\varphi$ in $(M, \vec{u})$.

AC1 just says that $A$ cannot be a cause of $B$ unless both $A$ and $B$ are true. The core of this definition lies in AC2. The variables in $\vec{Z}$ can be thought of as the variables on a causal path between the variables in $\vec{X}$ and the variables in $\varphi$.
While this is not always the case (see (Halpern, 2016, Section 2.9) for a counterexample), it is typically the case, and is a helpful intuition.$^3$ AC2(a) is reminiscent of the traditional counterfactual criterion, according to which $\vec{X} = \vec{x}$ is a cause of $\varphi$ if changing the value of $\vec{X}$ results in $\varphi$ being false. However, AC2(a) is more permissive than the traditional criterion; it allows the dependence of $\varphi$ on $\vec{X}$ to be tested under special structural contingencies, in which the variables $\vec{W}$ are held constant at some setting $\vec{w}$. AC2(b) is an attempt to counteract the permissiveness of AC2(a) with regard to structural contingencies. Essentially, it ensures that $\vec{X}$ alone suffices to bring about the change from $\varphi$ to $\neg\varphi$; setting $\vec{W}$ to $\vec{w}$ merely eliminates spurious side effects that tend to mask the action of $\vec{X}$.

To understand the role of AC2(b), consider the rock-throwing example again. Let $M$ be the model in Figure 1, and let $\vec{u}$ be the context where both Suzy and Billy throw. It is easy to see that both Suzy and Billy are causes of the bottle shattering in $(M, \vec{u})$: Let $\vec{Z} = \{ST, BS\}$, and consider the structural contingency where Billy doesn't throw (BT = 0). Clearly $(M, \vec{u}) \models [ST \leftarrow 0, BT \leftarrow 0](BS = 0)$ and $(M, \vec{u}) \models [ST \leftarrow 1, BT \leftarrow 0](BS = 1)$, so Suzy is a cause of the bottle shattering. A symmetric argument shows that Billy is also a cause.

But now consider the model $M'$ described in Figure 2; again, $\vec{u}$ is the context where both Suzy and Billy throw. It is still the case that Suzy is a cause of the bottle shattering in $(M', \vec{u})$. We can take $\vec{W} = \{BT\}$ and again consider the contingency where Billy doesn't throw. However, Billy is not a cause of the bottle shattering in $(M', \vec{u})$. For suppose that we now take $\vec{W} = \{ST\}$ and consider the contingency where Suzy doesn't throw. AC2(a) holds, since if Billy doesn't throw (under this contingency), then the bottle doesn't shatter. However, AC2(b) does not hold.
Since $BH \in \vec{Z}$, if we set BH to 0 (its original value), then AC2(b) would require that $(M', \vec{u}) \models [BT \leftarrow 1, ST \leftarrow 0, BH \leftarrow 0](BS = 1)$, but this is not the case. Similar arguments show that no other choice of $(\vec{Z}, \vec{W})$ makes Billy's throw a cause of the bottle shattering in $(M', \vec{u})$.

Footnote 3: Interestingly, in the original HP definition, we can in fact always take $\vec{Z}$ to consist of variables that lie on a causal path from a variable in $\vec{X}$; see (Halpern, 2016, Section 2.9).

The original HP definition differs from the updated definition in only one respect. Rather than requiring that $(M, \vec{u}) \models [\vec{X} \leftarrow \vec{x}, \vec{W}' \leftarrow \vec{w}, \vec{Z}' \leftarrow \vec{z}^*]\varphi$ for all subsets $\vec{W}'$ of $\vec{W}$, it was required to hold only for $\vec{W}$. That is, the following condition was used instead of AC2(b).

AC2(b′) $(M, \vec{u}) \models [\vec{X} \leftarrow \vec{x}, \vec{W} \leftarrow \vec{w}, \vec{Z}' \leftarrow \vec{z}^*]\varphi$ for all subsets $\vec{Z}'$ of $\vec{Z}$.

The requirement for AC2(b) to hold for all subsets of $\vec{W}$ in the updated definition prevents situations where $\vec{W}$ "conceals other causes for $\varphi$". The role of this requirement is perhaps best understood by considering the following example, due to Hopkins and Pearl (2003) (the description is taken from Halpern & Pearl, 2005a): Suppose that a prisoner dies either if $A$ loads $B$'s gun and $B$ shoots, or if $C$ loads and shoots his gun. Taking $D$ to represent the prisoner's death and making the obvious assumptions about the meaning of the variables, we have that $D = (A \wedge B) \vee C$. Suppose that in the actual context $\vec{u}$, $A$ loads $B$'s gun, $B$ does not shoot, but $C$ does load and shoot his gun, so that the prisoner dies. That is, $A = 1$, $B = 0$, and $C = 1$. $C = 1$ is a cause of $D = 1$, since if we set $C$ to 0, then $D = 0$. We would not want to say that $A = 1$ is a cause of $D = 1$, given that $B$ did not shoot (i.e., given that $B = 0$). However, with AC2(b′), $A = 1$ is a cause of $D = 1$. For we can take $\vec{W} = \{B, C\}$ and consider the contingency where $B = 1$ and $C = 0$.
It is easy to check that AC2(a) and AC2(b′) hold for this contingency, so under the original HP definition, $A = 1$ is a cause of $D = 1$. However, AC2(b) fails in this case, since $(M, \vec{u}) \models [A \leftarrow 1, C \leftarrow 0](D = 0)$. The key point is that AC2(b) says that for $A = 1$ to be a cause of $D = 1$, it must be the case that $D = 1$ even if only some of the values in $\vec{W}$ are set to $\vec{w}$. That means that the other variables get the same value as they do in the actual context; in this case, by setting only $C$ to 0 and leaving $B$ unset, $B$ takes on its original value of 0, in which case $D = 0$. AC2(b′) does not consider this case.

Using AC2(b) rather than AC2(b′) has been shown to have a significant benefit (and to lead to more intuitive results) when causality is applied to program verification, with the goal of understanding what in the code is the cause of a program not satisfying its specification (Beer, Ben-David, Chockler, Orni, & Trefler, 2012).

3. Relevant Complexity Classes

In this section, we briefly recall the definitions of the complexity classes that we need for our results, and define the complexity classes $D^P_k$.

Recall that the polynomial hierarchy is a hierarchy of complexity classes that generalize the classes NP and co-NP. Let $\Sigma^P_1 = \mathrm{NP}$ and $\Pi^P_1 = \text{co-NP}$. For $i > 1$, define $\Sigma^P_i = \mathrm{NP}^{\Sigma^P_{i-1}}$ and $\Pi^P_i = (\text{co-NP})^{\Sigma^P_{i-1}}$, where, in general, $X^Y$ denotes the class of problems solvable by a Turing machine in class $X$ augmented with an oracle for a problem complete for class $Y$. (See Sipser, 2012; Stockmeyer, 1977 for more details and intuition.)

We now define the classes $D^P_k$ as follows.

Definition 3.1 For $k = 1, 2, \ldots$,
$$D^P_k = \{L : \exists L_1, L_2 : L_1 \in \Sigma^P_k,\ L_2 \in \Pi^P_k,\ L = L_1 \cap L_2\}.$$

For $k = 1$, the class $D^P_1$ is the complexity class $D^P$ defined by Papadimitriou and Yannakakis (1984). It contains exact problems such as the language of pairs $\langle G, k \rangle$, where $G$ is a graph whose maximum clique has size exactly $k$.$^4$ Note that $D^P_1$ is not equivalent to $\mathrm{NP} \cap \text{co-NP}$ (and, more generally, $D^P_k$ is not equivalent to $\Sigma^P_k \cap \Pi^P_k$).
For example, although the exact clique problem is in $D^P_1$, as we just noted, it is not believed to be in either NP or co-NP, so it is not believed to be in $\mathrm{NP} \cap \text{co-NP}$. As usual, we say that a language $L$ is $D^P_k$-complete if it is in $D^P_k$ and is the hardest language in $D^P_k$, in the sense that there is a polynomial-time reduction from any language $L' \in D^P_k$ to $L$.

Footnote 4: The language of graphs that have a clique of size at least $k$ is in NP, since we can guess a set of vertices of size $k$ and check that it is a clique in polynomial time; in fact, it is a well-known NP-complete problem (Garey & Johnson, 1979). The language of graphs whose largest clique has size at most $k$ is in co-NP, since its complement is in NP.

Recall that a quantified Boolean formula (QBF) is a generalization of a propositional formula, where some propositional variables are quantified. Thus, for example, $\exists x \forall y (x \Rightarrow y)$ is a QBF. A closed QBF (CQBF) is one where there are no free propositional variables. A CQBF is either true or false, independent of the truth assignment. The canonical languages complete for $\Sigma^P_k$ and $\Pi^P_k$ consist of the CQBFs with $k$ alternations of quantifiers starting with $\exists$ (resp., $\forall$) that are true. In particular, let

$$\Sigma^P_2(\mathrm{SAT}) = \{\exists \vec{X} \forall \vec{Y} \varphi \mid \exists \vec{X} \forall \vec{Y} \varphi \text{ is a CQBF},\ \exists \vec{X} \forall \vec{Y} \varphi = \mathbf{true}\}$$
$$\Pi^P_2(\mathrm{SAT}) = \{\forall \vec{X} \exists \vec{Y} \varphi \mid \forall \vec{X} \exists \vec{Y} \varphi \text{ is a CQBF},\ \forall \vec{X} \exists \vec{Y} \varphi = \mathbf{true}\}.$$

$\Sigma^P_2(\mathrm{SAT})$ is complete for $\Sigma^P_2$ and $\Pi^P_2(\mathrm{SAT})$ is complete for $\Pi^P_2$ (Wrathall, 1976).

The following lemma provides a useful condition sufficient for a language to be $D^P_k$-complete.

Lemma 3.2 If $L_1$ is $\Sigma^P_k$-complete and $L_2$ is $\Pi^P_k$-complete, then $L_3 = L_1 \times L_2$ is $D^P_k$-complete.

Proof: Without loss of generality, let $L_1$ and $L_2$ be languages over some alphabet $\Phi$. The fact that $L_3$ is in $D^P_k$ is immediate from the definition of $D^P_k$ and the observation that $L_3 = (L_1 \times \Phi^*) \cap (\Phi^* \times L_2)$. For hardness, let $L'_3$ be a language in $D^P_k$. Then there exist $L'_1$ and $L'_2$ such that $L'_1 \in \Sigma^P_k$, $L'_2 \in \Pi^P_k$, and $L'_3 = L'_1 \cap L'_2$.
Let $f$ be a polynomial-time (many-to-one) reduction from $L'_1$ to $L_1$, and let $g$ be a polynomial-time (many-to-one) reduction from $L'_2$ to $L_2$ (the existence of such reductions $f$ and $g$ follows from the fact that $L_1$ and $L_2$ are $\Sigma^P_k$-complete and $\Pi^P_k$-complete, respectively). Then the reduction $\langle f, g \rangle : L'_3 \to L_1 \times L_2$ defined by $\langle f, g \rangle(x) = \langle f(x), g(x) \rangle$ is a polynomial-time reduction from $L'_3$ to $L_3 = L_1 \times L_2$, as required: $x \in L'_1 \cap L'_2$ iff $f(x) \in L_1$ and $g(x) \in L_2$.

Determining whether $\vec{X} = \vec{x}$ is a cause of $\varphi$ in $(M, \vec{u})$ is a decision problem: we define a language and try to determine whether a particular tuple is in that language. (See Section 4 for the formal definition.) Determining degree of responsibility and blame is a different type of problem, since we are determining which number represents the degree of responsibility (resp., blame). Formally, these are function problems. For ease of exposition, we restrict attention to functions from strings over some fixed alphabet $\Sigma$ to strings over $\Sigma$ (i.e., we are considering functions from $\Sigma^*$ to $\Sigma^*$). For a complexity class $A$ in the polynomial hierarchy, $FP^{A[\log n]}$ consists of all functions that can be computed by a polynomial-time Turing machine with an $A$-oracle which on input $x$ asks a total of $O(\log |x|)$ queries (Papadimitriou, 1984). A function $f(x)$ is $FP^{A[\log n]}$-hard iff for every function $g(x)$ in $FP^{A[\log n]}$ there exist polynomially computable functions $R, S : \Sigma^* \to \Sigma^*$ such that $g(x) = S(f(R(x)))$. A function $f(x)$ is complete in $FP^{A[\log n]}$ iff it is in $FP^{A[\log n]}$ and is $FP^{A[\log n]}$-hard. Note that since a $\Sigma^P_k$-oracle gives positive and negative answers in $O(1)$ time, the class $FP^{\Pi^P_k[\log n]}$ coincides with $FP^{\Sigma^P_k[\log n]}$ for all $k \geq 1$ (see also Krentel, 1988).

We remark that, rather than computing the degree of responsibility, which is a function problem, we could have considered the corresponding decision problem: "is the degree of responsibility of $\vec{X} = \vec{x}$ for $\varphi$ in $(M, \vec{u})$ at least $1/k$?".
Since $\vec{X} = \vec{x}$ is a cause of $\varphi$ in $(M, \vec{u})$ iff its degree of responsibility is greater than 0, this problem must be at least as hard as that of determining causality. It is also no harder, as we could limit our consideration of potential sets $\vec{W}$ in Def. 2.1 to sets of size no larger than $k - 1$ (since the degree of responsibility is defined as $1/(|\vec{W}| + 1)$), making this a variant of the general causality problem, with the reasoning for membership in the relevant complexity classes being the same as that for causality. Hence, it is $\Sigma^P_2$-complete for singleton causes and $D^P_2$-complete in general. That said, we believe that the function problem is the more relevant problem for most applications of degree of responsibility. (Similar remarks hold for degree of blame.)

Finally, for a complexity class $A$ in the polynomial hierarchy, $FP^A_{||}$ is the class of functions that can be computed by a polynomial-time Turing machine with parallel (i.e., non-adaptive) queries to an $A$-oracle. (For background on these complexity classes, see Jenner & Toran, 1995; Johnson, 1990.)

4. Complexity for the Updated HP Definition

In this section, we prove our results on the complexity of deciding causality. We start by defining the problem formally. In the definitions, $M$ stands for a causal model, $\vec{u}$ is a context, $\vec{X}$ is a subset of variables of $M$, and $\vec{x}$ is the set of values of $\vec{X}$ in $(M, \vec{u})$:

$$L_{cause} = \{\langle M, \vec{u}, \varphi, \vec{X}, \vec{x} \rangle : (\vec{X} = \vec{x}) \text{ is a cause of } \varphi \text{ in } (M, \vec{u})\}.$$

One of our goals is to understand the cause of the complexity of computing causality. Towards this end, it is useful to define two related languages:

$$L_{AC2} = \{\langle M, \vec{u}, \varphi, \vec{X}, \vec{x} \rangle : (\vec{X} = \vec{x}) \text{ satisfies AC1 and AC2 of Def. 2.1 for } \varphi \text{ in } (M, \vec{u})\},$$
$$L_{AC3} = \{\langle M, \vec{u}, \varphi, \vec{X}, \vec{x} \rangle : (\vec{X} = \vec{x}) \text{ satisfies AC1 and AC3 of Def. 2.1 for } \varphi \text{ in } (M, \vec{u})\}.$$

It is easy to see that $L_{cause} = L_{AC2} \cap L_{AC3}$.

Let $L^1_{cause}$ be the subset of $L_{cause}$ where $\vec{X}$ and $\vec{x}$ are singletons; this is the singleton causality problem.
We can similarly define $L^1_{AC2}$ and $L^1_{AC3}$. Again, we have $L^1_{cause} = L^1_{AC2} \cap L^1_{AC3}$, but, in fact, we have $L^1_{cause} = L^1_{AC2}$, since $L^1_{AC2} \subseteq L^1_{AC3}$; for singleton causality, the minimality condition AC3 trivially holds.

We denote by $L^B_{cause}$ the language of causality for binary causal models (i.e., where the models $M$ in the tuple are binary models), and by $L^B_{AC2}$ and $L^B_{AC3}$ the languages $L_{AC2}$ and $L_{AC3}$ restricted to binary causal models. Again we have that $L^B_{cause} = L^B_{AC2} \cap L^B_{AC3}$. And again, we can define $L^{B,1}_{cause}$, $L^{B,1}_{AC2}$, and $L^{B,1}_{AC3}$, and we have $L^{B,1}_{cause} = L^{B,1}_{AC2}$.

We start by considering singleton causality. As we observed, Eiter and Lukasiewicz (2002) and Hopkins (2001) showed that, with the original HP definition, singleton causality and causality coincide. However, for the updated definition, Halpern (2008) showed that it is in fact possible to have minimal causes that are not singletons. Thus, we consider singleton causality and general causality separately. We can clarify where the complexity lies by considering $L_{AC2}$ (and its sublanguages) and $L_{AC3}$ (and its sublanguages) separately.

Theorem 4.1 The languages $L_{AC2}$, $L^1_{AC2}$, $L^B_{AC2}$, and $L^{B,1}_{AC2}$ are $\Sigma^P_2$-complete.

Proof outline: To show that all these languages are in $\Sigma^P_2$, given a tuple $\langle M, \vec{u}, \varphi, \vec{X}, \vec{x} \rangle$, checking that AC1 holds, that is, checking that $(M, \vec{u}) \models \vec{X} = \vec{x} \wedge \varphi$, can be done in time polynomial in the size of $M$, $|\vec{X}|$, and $|\varphi|$ (the length of $\varphi$ as a string of symbols). For AC2, we need only guess the set $\vec{W}$ and the assignment $\vec{w}$. The check that assigning $\vec{w}$ to $\vec{W}$ and $\vec{x}'$ to $\vec{X}$ indeed falsifies $\varphi$ is polynomial; we use an NP oracle to check that for all subsets of $\vec{W}$ and all subsets of $\vec{Z}$, condition AC2(b) holds. (The oracle is used to check if these conditions do not hold, which can easily be checked in NP.) The argument is similar in spirit to Eiter and Lukasiewicz's argument that causality is in $\Sigma^P_2$ for general models with the original HP definition.
For hardness, it suffices to show that $L^{B,1}_{AC2}$ is $\Sigma^P_2$-hard. We do this by reducing $\Sigma^P_2(\mathrm{SAT})$ to $L^{B,1}_{AC2}$. Given a CQBF formula $\exists \vec{X} \forall \vec{Y} \varphi$, we show that we can efficiently construct a causal formula $\psi$, model $M$, and context $u$ such that $\exists \vec{X} \forall \vec{Y} \varphi = \mathbf{true}$ iff $(M, u, \psi, A, 0) \in L^{B,1}_{AC2}$. We leave details to Appendix A.

Since, as we have observed, AC3 holds trivially in the case of singleton causality, it follows that singleton causality is $\Sigma^P_2$-complete.

Corollary 4.2 $L^1_{cause}$ and $L^{B,1}_{cause}$ are $\Sigma^P_2$-complete.

We now show that things are harder if we do not restrict causes to singletons (unless the polynomial hierarchy collapses). As a first step, we consider the complexity of $L_{AC3}$ and $L^B_{AC3}$.

Theorem 4.3 $L_{AC3}$ and $L^B_{AC3}$ are $\Pi^P_2$-complete.

Proof outline: The fact that $L_{AC3}$ and $L^B_{AC3}$ are in $\Pi^P_2$ is straightforward. Again, given a tuple $\langle M, \vec{u}, \varphi, \vec{X}, \vec{x} \rangle$, we can check that AC1 holds in polynomial time. For AC3, we need to check that for all strict subsets $\vec{X}'$ of $\vec{X}$, AC2 fails. Since checking AC2 is in $\Sigma^P_2$, checking that it fails is in $\Pi^P_2$. Checking that it fails for all strict subsets $\vec{X}'$ keeps it in $\Pi^P_2$ (since it just adds one more universal quantifier). To prove that these languages are $\Pi^P_2$-hard, we show that we can reduce $\Pi^P_2(\mathrm{SAT})$ to $L^B_{AC3}$. The proof is similar in spirit to the proof of Theorem 4.1; we leave details to the appendix.

We are now ready to prove our main result.

Theorem 4.4 $L_{cause}$ and $L^B_{cause}$ are $D^P_2$-complete.

Proof outline: Membership of $L_{cause}$ (and hence also $L^B_{cause}$) in $D^P_2$ follows from the fact that $L_{cause} = L_{AC2} \cap L_{AC3}$, $L_{AC2} \in \Sigma^P_2$, and $L_{AC3} \in \Pi^P_2$. For hardness, we first consider the language $\Sigma^P_2(\mathrm{SAT}) \times \Pi^P_2(\mathrm{SAT})$, which, by Lemma 3.2, is $D^P_2$-complete. We then reduce this language to $L^B_{cause}$ using the constructions in Theorems 4.1 and 4.3. We leave details to Appendix B.
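The quantifier structure behind these membership arguments (guess a witness $(\vec{W}, \vec{w}, \vec{x}')$, then verify AC2(b) for all subsets) can be made concrete with a brute-force checker for small binary models. The sketch below is an illustration only, not the construction used in the proofs: it runs in exponential time, assumes binary variables and equations listed in topological order, and all names (`solve`, `ac1`, `ac2`, `ac3`, `is_cause`, `PRISONER`) are hypothetical. It is applied to the prisoner example of Section 2.2, $D = (A \wedge B) \vee C$.

```python
from itertools import chain, combinations, product


def subsets(items):
    """All subsets of items, as tuples."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


def solve(equations, context, do=None):
    """Unique solution of a recursive model; variables in `do` are overridden."""
    do = do or {}
    values = dict(context)
    for var, eq in equations.items():  # assumed to be in topological order
        values[var] = do[var] if var in do else eq(values)
    return values


def ac1(equations, context, phi, cause):
    actual = solve(equations, context)
    return phi(actual) and all(actual[v] == x for v, x in cause.items())


def ac2(equations, context, phi, cause, updated=True):
    """Search for a witness (W, w, x'); check AC2(b), or AC2(b') if not updated."""
    actual = solve(equations, context)
    xs = list(cause)
    others = [v for v in equations if v not in cause]
    for w_vars in subsets(others):
        z_rest = [v for v in others if v not in w_vars]  # Z minus X
        for x_alt in product((0, 1), repeat=len(xs)):
            for w_vals in product((0, 1), repeat=len(w_vars)):
                x_prime, w = dict(zip(xs, x_alt)), dict(zip(w_vars, w_vals))
                if phi(solve(equations, context, {**x_prime, **w})):
                    continue  # AC2(a) fails for this guess
                w_subs = list(subsets(w_vars)) if updated else [w_vars]
                if all(
                    phi(solve(equations, context, {
                        **cause,
                        **{v: w[v] for v in w_sub},
                        **{v: actual[v] for v in z_sub},
                    }))
                    for w_sub in w_subs
                    for z_sub in subsets(z_rest)
                ):
                    return True
    return False


def ac3(equations, context, phi, cause):
    """Minimality: AC2 fails for every nonempty strict subset of the cause."""
    names = list(cause)
    return not any(
        ac2(equations, context, phi, {v: cause[v] for v in sub})
        for sub in subsets(names)
        if 0 < len(sub) < len(names)
    )


def is_cause(equations, context, phi, cause):
    return (ac1(equations, context, phi, cause)
            and ac2(equations, context, phi, cause)
            and ac3(equations, context, phi, cause))


# Prisoner example: D = (A and B) or C, in the context A = 1, B = 0, C = 1.
PRISONER = {
    "A": lambda v: v["UA"],
    "B": lambda v: v["UB"],
    "C": lambda v: v["UC"],
    "D": lambda v: int((v["A"] and v["B"]) or v["C"]),
}
CONTEXT = {"UA": 1, "UB": 0, "UC": 1}
DEATH = lambda v: v["D"] == 1

print(is_cause(PRISONER, CONTEXT, DEATH, {"C": 1}))            # True
print(ac2(PRISONER, CONTEXT, DEATH, {"A": 1}, updated=True))   # False
print(ac2(PRISONER, CONTEXT, DEATH, {"A": 1}, updated=False))  # True
```

The last two calls reproduce the difference between AC2(b) and AC2(b′) discussed in Section 2.2, and the conjunction $A = 1 \wedge C = 1$ satisfies AC2 but fails AC3 because $C = 1$ alone already satisfies AC2.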
The fact that there may be more than one conjunct in a cause using the updated HP definition means that checking AC3 becomes nontrivial, and causes the increase in complexity from $\Sigma^P_2$ to $D^P_2$. But why is there no dropoff with the updated HP definition when we restrict to binary models, although there is a dropoff from $\Sigma^P_2$ to NP for the original HP definition? To prove their NP-completeness result, Eiter and Lukasiewicz (2002) showed that for binary models, with the original HP definition, the set $\vec{Z}$ and its subsets can be omitted from the definition of cause. That is, we can replace AC2(b′) by

AC2(b″) $(M, \vec{u}) \models [\vec{X} \leftarrow \vec{x}, \vec{W} \leftarrow \vec{w}]\varphi$

to get an equivalent definition. The example given by Halpern (2008), in which a cause requires more than one conjunct, shows that removing $\vec{Z}$ and its subsets from AC2(b) does not result in an equivalent definition in binary models. But even if it did, the fact that we need to quantify over all subsets $\vec{W}'$ of $\vec{W}$ in AC2(b) would be enough to ensure that there is no dropoff in complexity in binary models.

5. Responsibility and Blame

In this section, we review the definitions of responsibility and blame and characterize their complexity. See Chockler and Halpern (2004) for more intuition and details.

5.1 Responsibility

The definition of responsibility given by Chockler and Halpern (2004) was based on the original HP definition of causality, and thus assumed that causes were always single conjuncts. It is straightforward to extend it to allow causes to have arbitrarily many conjuncts.
Definition 5.1 The degree of responsibility of $\vec{X} = \vec{x}$ for $\varphi$ in $(M, \vec{u})$, denoted $dr((M, \vec{u}), (\vec{X} = \vec{x}), \varphi)$, is 0 if $\vec{X} = \vec{x}$ is not a cause of $\varphi$ in $(M, \vec{u})$; it is $1/(k+1)$ if $\vec{X} = \vec{x}$ is a cause of $\varphi$ in $(M, \vec{u})$ and there exists a partition $(\vec{Z}, \vec{W})$ and setting $(\vec{x}', \vec{w})$ for which AC2 holds such that (a) $k$ variables in $\vec{W}$ have different values in $\vec{w}$ than they do in the context $\vec{u}$ and (b) there is no partition $(\vec{Z}', \vec{W}')$ and setting $(\vec{x}'', \vec{w}')$ satisfying AC2 such that only $k' < k$ variables have different values in $\vec{w}'$ than they do in the context $\vec{u}$.⁵

Intuitively, $dr((M, \vec{u}), (\vec{X} = \vec{x}), \varphi)$ measures the minimal number of variables whose values have to be changed in order to make $\varphi$ counterfactually depend on $\vec{X}$. If there is no partition of $\mathcal{V}$ into $(\vec{Z}, \vec{W})$ that satisfies AC2, or $(\vec{X} = \vec{x})$ does not satisfy AC3 for $\varphi$ in $(M, \vec{u})$, then the minimal number of changes in Definition 5.1 is taken to have cardinality $\infty$, and thus the degree of responsibility of $(\vec{X} = \vec{x})$ is 0 (and hence it is not a cause).

In the original HP model, it was shown that computing responsibility is $\mathrm{FP}^{\mathrm{NP}[\log n]}$-complete in binary causal models (Chockler et al., 2008) and $\mathrm{FP}^{\Sigma^P_2[\log n]}$-complete in general causal models (Chockler & Halpern, 2004). First, we prove that the $\mathrm{FP}^{\Sigma^P_2[\log n]}$-completeness result holds for singleton causes.

Theorem 5.2 Computing the degree of responsibility is $\mathrm{FP}^{\Sigma^P_2[\log n]}$-complete for singleton causes in binary and general causal models.

Proof outline: The proof is quite similar to the proof given by Chockler and Halpern (2004). We prove membership by describing an algorithm in $\mathrm{FP}^{\Sigma^P_2[\log n]}$ for computing the degree of responsibility. Roughly speaking, the algorithm queries an oracle for the language

$$R = \{ \langle (M, \vec{u}), (X = x), \varphi, i \rangle : \langle (M, \vec{u}), (X = x), \varphi \rangle \in L^1_{\mathrm{cause}} \text{ and the degree of responsibility of } (X = x) \text{ for } \varphi \text{ is at least } i \}.$$

It follows easily from Corollary 4.2 that $R$ is $\Sigma^P_2$-complete.
The algorithm for computing the degree of responsibility performs a binary search on the values of $i$ such that $dr((M, \vec{u}), (X = x), \varphi) \ge i$, using $R$ as an oracle, dividing the range of possible values for the degree of responsibility by 2 according to the answer given by $R$. The number of possible candidates for the degree of responsibility is bounded by $n$, the size of the input, and thus the number of queries is at most $\lceil \log n \rceil$.

For hardness in binary causal models (which implies hardness in general causal models), we can reduce the $\mathrm{FP}^{\Sigma^P_2[\log n]}$-complete problem $\mathrm{MINQSAT}_2$ (Chockler & Halpern, 2004) to the degree of responsibility, where $\mathrm{MINQSAT}_2(\exists \vec{X}\, \forall \vec{Y} \psi)$ is the minimum number of 1's in a satisfying assignment to $\vec{X}$ for $\exists \vec{X}\, \forall \vec{Y} \psi$ if such an assignment exists, and $|\vec{X}| + 1$ otherwise. The argument is almost the same as that given by Chockler and Halpern in the case of the original HP definition; we refer the reader to their paper (Chockler & Halpern, 2004) for details.

5. This definition is slightly different from the definition of responsibility given by Halpern (2016); here we define the degree of responsibility for non-singletons $\vec{X} = \vec{x}$; even for singletons, the definitions are slightly different. That said, the two variants have the same complexity, as can be shown using a slight variant of the argument given in the proof of Theorem 5.3.

A perhaps surprising observation is that the complexity of computing the degree of responsibility for general causes is no higher than that for singletons, as the following theorem shows.⁶

Theorem 5.3 Computing the degree of responsibility is $\mathrm{FP}^{\Sigma^P_2[\log n]}$-complete in binary and general causal models.

Proof: Hardness follows immediately from hardness of a special case, proven in Theorem 5.2. It remains to show membership in $\mathrm{FP}^{\Sigma^P_2[\log n]}$. Recall that the language $L_{\mathrm{AC2}}$, defined in Section 4, is $\Sigma^P_2$-complete (see Theorem 4.1).
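The binary search used in the membership argument for Theorem 5.2 can be sketched as follows. This is a minimal illustration under assumptions, not the algorithm as specified in the proof: `oracle` is a hypothetical stand-in for a query to the $\Sigma^P_2$ oracle, rephrased so that `oracle(k)` answers whether the candidate is a cause with an AC2 witness that changes at most `k` variables (equivalently, whether the degree of responsibility is at least $1/(k+1)$). The answers are assumed to be monotone in `k`, and `n` is an upper bound on the number of variables that could be changed.

```python
from fractions import Fraction


def degree_of_responsibility(oracle, n):
    """Binary search for the least k with oracle(k) true, using O(log n) queries.
    Returns 1/(k+1), or 0 if no witness exists even with n changes allowed."""
    if not oracle(n):
        return Fraction(0)          # not a cause at all
    lo, hi = 0, n                   # invariant: oracle(hi) is true
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(mid):
            hi = mid                # a witness with at most mid changes exists
        else:
            lo = mid + 1            # every witness needs more than mid changes
    return Fraction(1, lo + 1)


# Hypothetical oracle: the smallest witness changes exactly 3 variables.
example = degree_of_responsibility(lambda k: k >= 3, 10)
```

Each loop iteration halves the range of candidate values, which is why a logarithmic number of oracle queries suffices.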
It is easy to show that the following language, which is a modification of $L_{\mathrm{AC2}}$, is in $\Sigma^P_2$ as well:

$$L^i_{\mathrm{AC2}} = \{ \langle (M, \vec{u}), (\vec{X} = \vec{x}), \varphi, i \rangle : \langle (M, \vec{u}), (\vec{X} = \vec{x}), \varphi \rangle \text{ satisfies AC1 and AC2 using a set } \vec{W} \text{ such that } |\vec{W}| \le i \}.$$

By Theorem 4.3, the language $L_{\mathrm{AC3}}$ (also defined in Section 4) is $\Pi^P_2$-complete, so its complement $\overline{L_{\mathrm{AC3}}}$ is $\Sigma^P_2$-complete. Thus, the language $L^i_{\mathrm{AC2}} \times \overline{L_{\mathrm{AC3}}}$ is in $\Sigma^P_2$. Let $R'$ be an oracle for the language $L^i_{\mathrm{AC2}} \times \overline{L_{\mathrm{AC3}}}$. The algorithm for computing the degree of responsibility first queries the oracle $R'$ for whether $\langle (M, \vec{u}), (\vec{X} = \vec{x}), \varphi \rangle$ is in $\overline{L_{\mathrm{AC3}}}$ (formally, by asking whether $(a, \langle (M, \vec{u}), (\vec{X} = \vec{x}), \varphi \rangle) \in L^i_{\mathrm{AC2}} \times \overline{L_{\mathrm{AC3}}}$, where $a$ is a fixed tuple that is trivially in $L^i_{\mathrm{AC2}}$; it is straightforward to construct such a tuple). If the answer is positive, then $(\vec{X} = \vec{x})$ is not a cause of $\varphi$ in $(M, \vec{u})$, as it does not satisfy the minimality condition AC3. Otherwise (i.e., if the answer is negative), the algorithm performs a binary search on the values of $i$ such that $dr((M, \vec{u}), (\vec{X} = \vec{x}), \varphi) \ge i$, using the oracle $R'$. The number of possible candidates for the degree of responsibility is bounded by $n$, the size of the input, and thus the number of queries is at most $\lceil \log n \rceil + 1$ (the additional 1 comes from the query to $\overline{L_{\mathrm{AC3}}}$).

5.2 Blame

The notion of blame addresses the situation where there is uncertainty about the true situation or "how the world works". Blame, introduced by Chockler and Halpern (2004), considers the true situation to be determined by the context, and "how the world works" to be determined by the structural equations. An agent's uncertainty is modeled by a pair $(\mathcal{K}, \Pr)$, where $\mathcal{K}$ is a set of pairs of the form $(M, \vec{u})$, where $M$ is a causal model and $\vec{u}$ is a context, and $\Pr$ is a probability distribution over $\mathcal{K}$. A pair $(M, \vec{u})$ is called a situation. We think of $\mathcal{K}$ as describing the situations that the agent considers possible before $\vec{X}$ is set to $\vec{x}$.
The degree of blame that setting $\vec{X}$ to $\vec{x}$ has for $\varphi$ is then the expected degree of responsibility of $\vec{X} = \vec{x}$ for $\varphi$ in $(M_{\vec{X} \leftarrow \vec{x}}, \vec{u})$, taken over the situations $(M, \vec{u}) \in \mathcal{K}$. Note that the situations $(M_{\vec{X} \leftarrow \vec{x}}, \vec{u})$ for $(M, \vec{u}) \in \mathcal{K}$ are those that the agent considers possible after $\vec{X}$ is set to $\vec{x}$.

Definition 5.4 The degree of blame of setting $\vec{X}$ to $\vec{x}$ for $\varphi$ relative to epistemic state $(\mathcal{K}, \Pr)$, denoted $db(\mathcal{K}, \Pr, \vec{X} \leftarrow \vec{x}, \varphi)$, is

$$\sum_{(M, \vec{u}) \in \mathcal{K}} dr((M_{\vec{X} \leftarrow \vec{x}}, \vec{u}), \vec{X} = \vec{x}, \varphi) \Pr((M, \vec{u})).$$

6. Note that the conference version of this paper (Aleksandrowicz, Chockler, Halpern, & Ivrii, 2014) claimed that the problem was $\mathrm{FP}^{D^P_2[\log n]}$-complete. As we show here, this claim was incorrect.

For the original HP definition of cause, Chockler and Halpern (2004) show that the complexity of computing the degree of blame is best described using the complexity classes $\mathrm{FP}^{\Sigma^P_2}_{\|}$ and $\mathrm{FP}^{\mathrm{NP}}_{\|}$, which consist of all functions that can be computed in polynomial time with parallel (i.e., nonadaptive) queries to a $\Sigma^P_2$ (respectively, NP) oracle. (For background on these complexity classes, see Jenner & Toran, 1995; Johnson, 1990.) The degree of blame for the original definition of cause is proven to be $\mathrm{FP}^{\Sigma^P_2}_{\|}$-complete for general causal models and $\mathrm{FP}^{\mathrm{NP}}_{\|}$-complete for binary causal models. Since the complexity of computing blame is directly derived from the complexity of computing responsibility, we have the following theorem.

Theorem 5.5 The problem of computing blame in recursive causal models is $\mathrm{FP}^{\Sigma^P_2}_{\|}$-complete for singleton and general causes in binary and general causal models.

Proof outline: The fact that computing blame for singleton causes (and hence also for general causes) is $\mathrm{FP}^{\Sigma^P_2}_{\|}$-hard follows directly from the proof of the $\mathrm{FP}^{\Sigma^P_2}_{\|}$-completeness of blame for singleton causes, which is the same as the proof for the previous definition of causality (Chockler & Halpern, 2004).
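Definition 5.4 is a probability-weighted sum, which the following sketch makes explicit. It is only an illustration under assumptions: `degree_of_blame`, `situations`, and `responsibility` are hypothetical names, each situation is an opaque label, and `responsibility(s)` is assumed to return the degree of responsibility already computed in the corresponding intervened situation $(M_{\vec{X} \leftarrow \vec{x}}, \vec{u})$. All oracle queries are hidden inside that function; the sum itself is polynomial-time arithmetic.

```python
from fractions import Fraction


def degree_of_blame(situations, responsibility):
    """Expected degree of responsibility over an epistemic state.
    `situations` is an iterable of (situation, probability) pairs whose
    probabilities are assumed to sum to 1."""
    return sum(
        (Fraction(prob) * responsibility(situation) for situation, prob in situations),
        Fraction(0),
    )


# Hypothetical epistemic state with three situations.
epistemic_state = [("s1", Fraction(1, 2)), ("s2", Fraction(1, 4)), ("s3", Fraction(1, 4))]
known_dr = {"s1": Fraction(1), "s2": Fraction(1, 2), "s3": Fraction(0)}
blame = degree_of_blame(epistemic_state, known_dr.__getitem__)
```

Because the responsibility values for different situations do not depend on one another, they can be obtained independently, which is what makes a nonadaptive (parallel) query pattern natural here.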
The fact that computing blame for general causes (and hence for singleton causes) is in $\mathrm{FP}^{\Sigma^P_2}_{\|}$ is shown by a parallel computation of the degree of responsibility for all $(M, \vec{u}) \in \mathcal{K}$ and an application of Theorem 5.3. The degree of blame, which is the expected degree of responsibility taken over the situations $(M, \vec{u}) \in \mathcal{K}$ according to the probability distribution $\Pr$, can then be computed in polynomial time, given the degree of responsibility for each situation.

6. Conclusion

In this paper, we have examined the complexity of computing whether $\vec{X} = \vec{x}$ is a cause of $\varphi$ and the degree of responsibility and blame of $\vec{X} = \vec{x}$ for $\varphi$ according to the updated definition of causality introduced by Halpern and Pearl (2005b). According to Halpern and Pearl's (2001) original definition of causality, the complexity of computing whether $\vec{X} = \vec{x}$ is a cause of $\varphi$ was shown by Eiter and Lukasiewicz (2002) to be NP-complete for binary models, and $\Sigma^P_2$-complete for general models. For the updated definition, we show that the complexity rises to $D^P_2$-complete, both for binary and general models. On the other hand, the complexity of computing the degree of responsibility and blame does not change as we move from the original definition to the updated definition.

These results suggest that computing causality will be difficult for practical problems. The obvious question is whether there are special cases of the problem that are of practical interest and are more tractable. An example of a tractable class of interest for the original definition was given by Chockler, Halpern, and Kupferman (2008). For problems in this class, the original and updated definitions agree, so they are tractable for the updated definition as well. Another example was given by Chockler et al. (2015), who used causal models for the analysis of complex legal cases.
These models were shown to be highly modular, with each variable affecting only a small part of the model, thus allowing for a modular computation of the degrees of responsibility and blame. It would be of interest to find other such classes.

More recently, Halpern (2015) proposed yet another modification of the HP definition, and showed that checking whether $\vec{X} = \vec{x}$ is a cause of $\varphi$ is $D^P$-complete. Alechina, Halpern, and Logan (2016) showed that the problem of determining the degree of responsibility and the degree of blame of $\vec{X} = \vec{x}$ for $\varphi$ is $\mathrm{FP}^{\Sigma^P_2[\log n]}$-complete, just as for the original and updated definitions. Given the existence of three variants of the definition, clearly more work is needed to understand which is most appropriate and when; see the discussion by Halpern (2016) for more on this issue.

Acknowledgements

A preliminary version of this paper appears in Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence (AAAI-14), 2014 (Aleksandrowicz et al., 2014). Joseph Halpern was supported in part by NSF grants IIS-0911036 and CCF-1214844, AFOSR grant FA9550-08-1-0438, ARO grant W911NF-14-1-0017, and by the DoD Multidisciplinary University Research Initiative (MURI) program administered by AFOSR under grant FA9550-12-1-0040.

Appendix A. Proof of Theorem 4.1

As we observed in the main part of the paper, membership is straightforward, so we focus here on hardness. For hardness, we describe a reduction from the language $\Sigma^P_2(\mathrm{SAT})$ to $L^{B,1}_{\mathrm{AC2}}$. In the process, we work with both propositional formulas with propositional variables, and causal formulas that use formulas like $X = 1$ and $X = 0$. We can think of $X$ as a propositional variable here, where $X = 1$ denotes that $X$ is true, and $X = 0$ denotes that $X$ is false. If $\varphi$ is a propositional formula, let $\bar{\varphi}$ be the causal formula that results by replacing each occurrence of a propositional variable $X$ by $X = 1$.
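Given a CQBF $\exists \vec{X}\, \forall \vec{Y} \varphi$, consider the tuple $(M, \vec{u}, \psi, A, 0)$, where $M = (\mathcal{U}, \mathcal{V}, \mathcal{R})$ is a binary causal model and

• $\mathcal{V} = \{X^0 \mid X \in \vec{X}\} \cup \{X^1 \mid X \in \vec{X}\} \cup \{Y \mid Y \in \vec{Y}\} \cup \{A\}$, where $A$ is a fresh variable that does not appear in $\vec{X}$ or $\vec{Y}$;
• for all variables $V \in \mathcal{V}$, the structural equation is $V = U$ (i.e., all the variables in $\mathcal{V}$ are set to the value of $U$);
• $\psi = \psi_1 \lor (\psi_2 \land \psi_3)$, where $\psi_1$, $\psi_2$, $\psi_3$ are the following causal formulas:
  – $\psi_1 = \neg\bigl(\bigwedge_{X \in \vec{X}} (X^0 \ne X^1)\bigr)$;⁷
  – $\psi_2 = \neg(A = 1 \land \vec{Y} = \vec{1})$;
  – $\psi_3 = (A = 1) \lor \bar{\varphi}[\vec{X}/\vec{X}^1]$, where $\bar{\varphi}[\vec{X}/\vec{X}^1]$ is the result of replacing each occurrence of a variable $X \in \vec{X}$ by $X^1$.

7. As usual, we take $X^0 \ne X^1$ to be an abbreviation for the causal formula $(X^0 = 1 \land X^1 = 0) \lor (X^0 = 0 \land X^1 = 1)$.

We prove that $\exists \vec{X}\, \forall \vec{Y} \varphi = \mathbf{true}$ iff $A = 0$ is a cause of $\psi$ in $(M, \vec{u})$ (which is the case iff $(M, \vec{u}, \psi, A, 0) \in L^{B,1}_{\mathrm{AC2}}$, since AC3 holds trivially for singleton causes).

First suppose that $\exists \vec{X}\, \forall \vec{Y} \varphi = \mathbf{true}$. To show that $A = 0$ is a cause of $\psi$ in $(M, \vec{u})$, we prove that AC1 and AC2 hold. Clearly AC1 holds: the fact that $(M, \vec{u}) \models A = 0$ is immediate, given the equation for $A$, and $(M, \vec{u}) \models \psi$ since $(M, \vec{u}) \models \psi_1$, again by the definition of $\mathcal{F}$. For AC2, let $\vec{W} = \mathcal{V} - \{A\}$ and define $\vec{w}$ as follows. Let $\tau$ be an assignment to the variables in $\vec{X}$ for which $\forall \vec{Y} \varphi = \mathbf{true}$. Using $w(X)$ to denote the value of $X$ according to $\vec{w}$, we define

• $w(X^{\tau(X)}) = 1$ for all $X \in \vec{X}$;
• $w(X^{1 - \tau(X)}) = 0$ for all $X \in \vec{X}$; and
• $w(Y) = 1$ for all $Y \in \vec{Y}$.

For AC2(a), note that $(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg\psi_1$ (since $\vec{w}$ assigns different values to $X^0$ and $X^1$ for all $X \in \vec{X}$) and, since $w(Y) = 1$ for all $Y \in \vec{Y}$, we have that $(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg\psi_2$, so $(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg\psi$. Thus, AC2(a) holds.

It now remains to show that AC2(b) holds. Fix $\vec{W}' \subseteq \vec{W}$. We must show that $(M, \vec{u}) \models [A \leftarrow 0, \vec{W}' \leftarrow \vec{w}]\psi$. (The condition "for all $\vec{Z}' \subseteq \vec{Z} - \{A\}$" is vacuous in this case, since $\vec{Z} = \{A\}$.) Since the definition of $M$ guarantees that $(M, \vec{u}) \models [A \leftarrow 0, \vec{W}' \leftarrow \vec{w}]\psi$ iff $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}]\psi$, we focus on the latter condition from here on in. If $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}]\psi_1$, we are done.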
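So suppose that $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}]\neg\psi_1$; that is,

$$(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}] \bigwedge_{X \in \vec{X}} (X^0 \ne X^1). \tag{1}$$

It follows that, for each variable $X \in \vec{X}$, we have that $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}](X^1 = \tau(X))$. To see this, note that if $\tau(X) = 1$, then we must have $X^1 \in \vec{W}'$; otherwise, we would have $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}](X^1 = 0 \land X^0 = 0)$, contradicting (1). And if $\tau(X) = 0$, then since $w(X^1) = 0$, we must have $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}](X^1 = 0)$, whether or not $X^1 \in \vec{W}'$. Since $\forall \vec{Y} \varphi$ is true under $\tau$, the formula $\bar{\varphi}[\vec{X}/\vec{X}^1]$ holds whatever values the variables in $\vec{Y}$ take, so $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}]\psi_3$. Clearly $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}]\psi_2$. It follows that $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}]\psi$, showing that AC2(b) holds.

Finally, we must show that if $A = 0$ is a cause of $\psi$ in $(M, \vec{u})$ then $\exists \vec{X}\, \forall \vec{Y} \varphi = \mathbf{true}$. So suppose that $A = 0$ is a cause of $\psi$ in $(M, \vec{u})$. Then there exists a witness $(\vec{W}, \vec{w}, a)$. Since we are considering binary models, we must have $a = 1$, so we have

$$(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg\psi. \tag{2}$$

This implies that $(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg\psi_1$, so

$$(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}] \bigwedge_{X \in \vec{X}} (X^0 \ne X^1).$$

Define $\tau$ so that $\tau(X) = b$, where $b \in \{0, 1\}$ is the unique value for which $(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}](X^b = 1)$.

It also follows from (2) that $(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg(\psi_2 \land \psi_3)$. Hence, $(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg\psi_2$ or $(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg\psi_3$. Since $(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}]\psi_3$ (because $A$ is assigned 1), we have that $(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg\psi_2$. Since $\psi_2 = \neg(A = 1 \land \vec{Y} = \vec{1})$, we have that $(M, \vec{u}) \models [A \leftarrow 1, \vec{W} \leftarrow \vec{w}](\vec{Y} = \vec{1})$. It follows that $\vec{Y} \subseteq \vec{W}$ and $w(Y) = 1$ for all $Y \in \vec{Y}$.

Now let $\nu$ be an arbitrary assignment to $\vec{X}$ and $\vec{Y}$ such that $\nu|_{\vec{X}} = \tau$. It suffices to show that $\varphi$ is true under assignment $\nu$. Let $\vec{W}' = \vec{W} - \{Y \in \vec{Y} \mid \nu(Y) = 0\}$; that is, $\vec{W}'$ contains all the variables $X^b$ that are in $\vec{W}$, and all the variables $Y \in \vec{Y}$ for which $\nu(Y) = 1$. By AC2(b), it follows that $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}]\psi$. Since $\vec{W}'$ contains all the variables $X^b$ in $\vec{W}$, we have that $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}]\neg\psi_1$. Thus, we must have that $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}]\psi_3$. Since $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}](A = 0)$, it follows that $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}]\bar{\varphi}[\vec{X}/\vec{X}^1]$. Note that, for $Y \in \vec{Y}$, the value of $Y$ under $[\vec{W}' \leftarrow \vec{w}]$ is 1 iff $\nu(Y) = 1$; moreover, $w(X^1) = 1$ iff $\tau(X) = 1$ iff $\nu(X) = 1$.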
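Thus, the fact that $(M, \vec{u}) \models [\vec{W}' \leftarrow \vec{w}]\psi_3$ implies that $\varphi$ is satisfied by $\nu$, so we are done. This completes the proof of the theorem. For future reference, we note that our argument proved more than was strictly necessary, since we showed that $\exists \vec{X}\, \forall \vec{Y} \varphi = \mathbf{true}$ iff $A = 0$ is a cause of $\psi$ in $(M, \vec{u})$.

Appendix B. Proof of Theorem 4.3

Again, as we observed in the main part of the paper, membership is straightforward, so we focus on hardness. We describe a reduction from the language $\Pi^P_2(\mathrm{SAT})$ to $L^B_{\mathrm{AC3}}$, which suffices to prove the result. The argument is similar in spirit to that for Theorem 4.1.

Given a CQBF $\forall \vec{Y}\, \exists \vec{X} \varphi$, consider the tuple $(M, \vec{u}, \psi, \langle A_1, A_2 \rangle, \langle 0, 0 \rangle)$, where $M = (\mathcal{U}, \mathcal{V}, \mathcal{R})$ is a binary causal model and

• $\mathcal{V} = \vec{X} \cup \{Y^0 \mid Y \in \vec{Y}\} \cup \{Y^1 \mid Y \in \vec{Y}\} \cup \{A_1, A_2, S\}$, where $A_1$, $A_2$, and $S$ are fresh variables;
• the structural equations for $A_1$ and $A_2$ are $A_1 = S$ and $A_2 = S$, and, for all other variables $V \in \mathcal{V}$, the equation is $V = U$;
• $\psi = \psi_1 \lor (\psi_2 \land \psi_3) \lor (S = 0)$, where
  – $\psi_1 = \neg\bigl(\bigwedge_{Y \in \vec{Y}} (Y^0 \ne Y^1)\bigr)$;
  – $\psi_2 = \neg(\vec{A} = \vec{1} \land \vec{X} = \vec{1})$;
  – $\psi_3 = (A_1 = A_2) \lor \neg\bar{\varphi}[\vec{Y}/\vec{Y}^1]$.

We now prove that the following are equivalent:

(a) $\forall \vec{Y}\, \exists \vec{X} \varphi = \mathbf{true}$;
(b) $(M, \vec{u}, \psi, \vec{A}, \vec{0})$ is in $L^B_{\mathrm{AC3}}$;
(c) $\vec{A} = \vec{0}$ is a cause of $\psi$ in $(M, \vec{u})$.

The equivalence of (a) and (b) is all we need to prove Theorem 4.3, but the stronger result is useful for proving Theorem 4.4.

We first show that (a) implies (c). So suppose that $\forall \vec{Y}\, \exists \vec{X} \varphi = \mathbf{true}$. To show that $\vec{A} = \vec{0}$ is a cause of $\psi$ in $(M, \vec{u})$, we must prove that AC1, AC2, and AC3 hold. For AC1, since $(M, \vec{u}) \models \vec{Y}^0 = \vec{Y}^1 = \vec{0}$, we have $(M, \vec{u}) \models \psi_1$, so $(M, \vec{u}) \models (\vec{A} = \vec{0}) \land \psi$.

For AC2, let $\vec{W} = \{S\} \cup \vec{X} \cup \vec{Y}^0 \cup \vec{Y}^1$ $(= \mathcal{V} - \{A_1, A_2\})$, and define $\vec{w}$ as follows:

• $w(S) = 1$;
• $w(X) = 1$ for all $X \in \vec{X}$;
• $w(Y^1) = 1$ and $w(Y^0) = 0$ for all $Y \in \vec{Y}$.

We have that

$$(M, \vec{u}) \models [\vec{A} \leftarrow \vec{1}, \vec{W} \leftarrow \vec{w}]\Bigl(\bigwedge_{Y \in \vec{Y}} (Y^0 \ne Y^1) \land (\vec{A} = \vec{1} \land \vec{X} = \vec{1})\Bigr),$$

so $(M, \vec{u}) \models [\vec{A} \leftarrow \vec{1}, \vec{W} \leftarrow \vec{w}](\neg\psi_1 \land \neg\psi_2 \land (S = 1))$. It follows that $(M, \vec{u}) \models [\vec{A} \leftarrow \vec{1}, \vec{W} \leftarrow \vec{w}]\neg\psi$, showing that AC2(a) holds. For AC2(b), let $\vec{W}'$ be a subset of $\vec{W}$. We must show that $(M, \vec{u}) \models [\vec{A} \leftarrow \vec{0}, \vec{W}' \leftarrow \vec{w}]\psi$. Clearly $(M, \vec{u}) \models [\vec{A} \leftarrow \vec{0}, \vec{W}' \leftarrow \vec{w}](\vec{A} = \vec{0})$.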
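It is immediate from the definitions of $\psi_2$ and $\psi_3$ that $(\vec{A} = \vec{0}) \Rightarrow (\psi_2 \land \psi_3)$ is valid, so $(M, \vec{u}) \models [\vec{A} \leftarrow \vec{0}, \vec{W}' \leftarrow \vec{w}](\psi_2 \land \psi_3)$. Hence, $(M, \vec{u}) \models [\vec{A} \leftarrow \vec{0}, \vec{W}' \leftarrow \vec{w}]\psi$, as desired.

To show that AC3 holds, we need to show that neither $A_1 = 0$ nor $A_2 = 0$ is a cause of $\psi$ in $(M, \vec{u})$. We prove that $A_1 = 0$ is not a cause of $\psi$ in $(M, \vec{u})$; the argument for $A_2 = 0$ not being a cause is identical. It suffices to prove that AC2 does not hold for $A_1 = 0$. So suppose by way of contradiction that $(\vec{W}, \vec{w}, 1)$ is a witness for $A_1 = 0$ being a cause of $\psi$ in $(M, \vec{u})$. Since AC2(a) holds, we must have

$$(M, \vec{u}) \models [A_1 \leftarrow 1, \vec{W} \leftarrow \vec{w}](\neg\psi_1 \land (\neg\psi_2 \lor \neg\psi_3) \land (S = 1)). \tag{3}$$

Thus, $S \in \vec{W}$ and $w(S) = 1$ (for otherwise $(M, \vec{u}) \models [A_1 \leftarrow 1, \vec{W} \leftarrow \vec{w}](S = 0)$). Moreover, since $(M, \vec{u}) \models [A_1 \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg\psi_1$, for all $Y \in \vec{Y}$, either $Y^0 \in \vec{W}$ and $w(Y^0) = 1$ or $Y^1 \in \vec{W}$ and $w(Y^1) = 1$, and it is not the case that both $Y^0$ and $Y^1$ are in $\vec{W}$ and $w(Y^0) = w(Y^1)$.

Now consider $A_2$. There are three possibilities:

(i) $A_2 \in \vec{W}$ and $w(A_2) = 0$;
(ii) $A_2 \in \vec{W}$ and $w(A_2) = 1$;
(iii) $A_2 \notin \vec{W}$.

We show that we get a contradiction in each case.

If (i) holds, note that since $(M, \vec{u}) \models [A_1 \leftarrow 1, \vec{W} \leftarrow \vec{w}](A_2 = 0)$, it follows that $(M, \vec{u}) \models [A_1 \leftarrow 1, \vec{W} \leftarrow \vec{w}]\psi_2$, so by (3), $(M, \vec{u}) \models [A_1 \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg\psi_3$. Since $(M, \vec{u}) \models [A_1 \leftarrow 1, \vec{W} \leftarrow \vec{w}](A_1 \ne A_2)$, it follows that $(M, \vec{u}) \models [A_1 \leftarrow 1, \vec{W} \leftarrow \vec{w}]\bar{\varphi}[\vec{Y}/\vec{Y}^1]$. Let $\vec{Z}' = \emptyset$ and let $\vec{W}' = \vec{W} - \{A_2\}$. We show that $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}]\neg\psi$, so that AC2(b) does not hold. First observe that $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}](A_1 \ne A_2)$. Since $S$ and all the variables in $\vec{X}$, $\vec{Y}^0$, and $\vec{Y}^1$ are in both $\vec{W}$ and $\vec{W}'$, it follows from (3) that $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}](\neg\psi_1 \land \bar{\varphi}[\vec{Y}/\vec{Y}^1] \land (S = 1))$. Thus, $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}]\neg\psi$, and AC2(b) does not hold.

If (ii) or (iii) holds, define an assignment $\nu$ to the variables in $\vec{Y}$ by taking $\nu(Y) = 1$ if $Y^1 \in \vec{W}$ and $w(Y^1) = 1$ and $\nu(Y) = 0$ if $Y^0 \in \vec{W}$ and $w(Y^0) = 1$. (As we observed above, exactly one of these two cases occurs, so $\nu$ is well defined.)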
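By assumption, $\forall \vec{Y}\, \exists \vec{X} \varphi = \mathbf{true}$, so there exists an assignment $\tau$ to the variables that makes $\varphi$ true with $\tau(Y) = \nu(Y)$ for all $Y \in \vec{Y}$. We again show that AC2(b) does not hold. Let $\vec{Z}' = \emptyset$ and let $\vec{W}' = \vec{W} - \{X : \tau(X) = 0\}$. Since $S \in \vec{W}'$ and $w(S) = 1$, it is easy to see that in both case (ii) and case (iii), we have $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}](A_1 \ne A_2)$. (In case (ii), this is because $w(A_2) = 1$; in case (iii), this is because we have the equation $A_2 = S$ and $w(S) = 1$.) The definition of $\vec{W}'$ ensures that $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}]\bar{\varphi}[\vec{Y}/\vec{Y}^1]$, so that $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}]\neg\psi_3$. Hence, $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}]\neg\psi$, again showing that AC2(b) does not hold. We conclude that AC3 holds for $\vec{A}$. This completes the proof that (a) implies (c).

The fact that (c) implies (b) is immediate.

It remains to show that (b) implies (a). So suppose that $(M, \vec{u}, \psi, \vec{A}, \vec{0})$ is in $L^B_{\mathrm{AC3}}$. We must show that $\forall \vec{Y}\, \exists \vec{X} \varphi(\vec{X}, \vec{Y}) = \mathbf{true}$. Let $\nu$ be some assignment to $\vec{Y}$. Let $\vec{W} = \{S\} \cup \vec{X} \cup \vec{Y}^0 \cup \vec{Y}^1$ and define $\vec{w}$ as follows:

• $w(S) = 1$;
• $w(X) = 1$ for all $X \in \vec{X}$;
• $w(Y^{\nu(Y)}) = 1$ and $w(Y^{1 - \nu(Y)}) = 0$ for all $Y \in \vec{Y}$.

Since AC3 holds, $A_1 = 0$ cannot be a cause of $\psi$ in $(M, \vec{u})$ with witness $(\vec{W}, \vec{w}, 1)$. It is straightforward to check that $(M, \vec{u}) \models [A_1 \leftarrow 1, \vec{W} \leftarrow \vec{w}]\neg\psi$, using the fact that $w(S) = 1$. Hence, AC2(a) holds for $A_1 = 0$. AC3 holds trivially, and we have already observed that AC1 holds. Thus, AC2(b) cannot hold for $A_1 = 0$; that is, there exist $\vec{W}' \subseteq \vec{W}$ and $\vec{Z}' \subseteq \{A_2\}$ such that $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}, \vec{Z}' \leftarrow \vec{z}^*]\neg\psi$. It follows that $S \in \vec{W}'$ and $w(S) = 1$; and, for all $Y \in \vec{Y}$, either $Y^0 \in \vec{W}'$ and $w(Y^0) = 1$ or $Y^1 \in \vec{W}'$ and $w(Y^1) = 1$, and it is not the case that both $Y^0$ and $Y^1$ are in $\vec{W}'$ and $w(Y^0) = w(Y^1)$. Since $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}, \vec{Z}' \leftarrow \vec{z}^*]\psi_2$, it must be the case that $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}, \vec{Z}' \leftarrow \vec{z}^*]\neg\psi_3$. This, in turn, implies that $(M, \vec{u}) \models [A_1 \leftarrow 0, \vec{W}' \leftarrow \vec{w}, \vec{Z}' \leftarrow \vec{z}^*]\bar{\varphi}[\vec{Y}/\vec{Y}^1]$. Now define $\tau(X) = 1$ iff $X \in \vec{W}'$. It is immediate that $\tau$ satisfies $\varphi$ if the values of $\vec{Y}$ are assigned according to $\nu$. It follows that $\forall \vec{Y}\, \exists \vec{X} \varphi(\vec{X}, \vec{Y}) = \mathbf{true}$, as desired.

Appendix C. Proof of Theorem 4.4

As mentioned in the proof outline, membership is straightforward, so we focus on hardness. Note that it suffices to prove that $L^B_{\mathrm{cause}}$ is $D^P_2$-hard; the fact that $L_{\mathrm{cause}}$ is $D^P_2$-hard follows immediately. In order to prove that $L^B_{\mathrm{cause}}$ is $D^P_2$-hard, first consider the language $\mathrm{SAT}_2 = \Sigma^P_2(\mathrm{SAT}) \times \Pi^P_2(\mathrm{SAT})$. By Lemma 3.2, $\mathrm{SAT}_2$ is $D^P_2$-complete. Thus, it suffices to reduce $\mathrm{SAT}_2$ to $L^B_{\mathrm{cause}}$. We do this by combining the constructions of Theorems 4.1 and 4.3 as follows.

Given a pair $\langle \exists \vec{X}\, \forall \vec{Y} \varphi,\ \forall \vec{Y}'\, \exists \vec{X}' \varphi' \rangle$, we assume without loss of generality that the formulas in the pair use disjoint sets of variables. For simplicity, we use primed versions of variable names for the second component of the pair. Let $\langle M, \vec{u}, \psi, A, 0 \rangle$ be the tuple constructed in the proof of Theorem 4.1, and let $\langle M', \vec{u}', \psi', \langle A'_1, A'_2 \rangle, \langle 0, 0 \rangle \rangle$ be the tuple constructed in the proof of Theorem 4.3 (using primed variable names). Recall that the proof of Theorem 4.1 showed that $\exists \vec{X}\, \forall \vec{Y} \varphi = \mathbf{true}$ iff $A = 0$ is a cause of $\psi$ in $(M, \vec{u})$, and the proof of Theorem 4.3 showed that $\forall \vec{Y}'\, \exists \vec{X}' \varphi' = \mathbf{true}$ iff $\vec{A}' = \vec{0}$ is a cause of $\psi'$ in $(M', \vec{u}')$.

The reduction from $\mathrm{SAT}_2$ constructs a model $M''$ that can be viewed as the union of $M$ and $M'$. That is, the set of endogenous variables in $M''$ is the union of the endogenous variables in $M$ and in $M'$, and similarly for the exogenous variables, and the equations for $M''$ consist of the union of the equations for $M$ and $M'$. We claim that $A = 0 \land \vec{A}' = \vec{0}$ is a cause of $\psi \lor \psi'$ in $(M'', \vec{u})$ iff $\langle \exists \vec{X}\, \forall \vec{Y} \varphi,\ \forall \vec{Y}'\, \exists \vec{X}' \varphi' \rangle \in \mathrm{SAT}_2$. (Note that $(M'', \vec{u}) \models \psi \lor \psi'$; thus, to make $\psi \lor \psi'$ false, both $\psi$ and $\psi'$ must be false.)

First suppose that $\langle \exists \vec{X}\, \forall \vec{Y} \varphi,\ \forall \vec{Y}'\, \exists \vec{X}' \varphi' \rangle \in \mathrm{SAT}_2$. Then $A = 0$ is a cause of $\psi$ in $(M, \vec{u})$ and $\vec{A}' = \vec{0}$ is a cause of $\psi'$ in $(M', \vec{u}')$. It easily follows that $A = 0 \land \vec{A}' = \vec{0}$ is a cause of $\psi \lor \psi'$ in $(M'', \vec{u})$. The other direction is similar. If $A = 0 \land \vec{A}' = \vec{0}$ is a cause of $\psi \lor \psi'$ in $(M'', \vec{u})$, then, since the non-primed variables affect only the value of $\psi$, $A = 0$ is a cause of $\psi$ in $(M, \vec{u})$, and similarly, $\vec{A}' = \vec{0}$ is a cause of $\psi'$ in $(M', \vec{u}')$.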
It follows from Theorems 4.1 and 4.3 that $\exists \vec{X}\, \forall \vec{Y} \varphi \in \Sigma^P_2(\mathrm{SAT})$ and $\forall \vec{Y}'\, \exists \vec{X}' \varphi' \in \Pi^P_2(\mathrm{SAT})$; therefore $\langle \exists \vec{X}\, \forall \vec{Y} \varphi,\ \forall \vec{Y}'\, \exists \vec{X}' \varphi' \rangle \in \mathrm{SAT}_2$.

References

Alechina, N., Halpern, J. Y., & Logan, B. (2016). Causality, responsibility, and blame in team plans. Unpublished manuscript.

Aleksandrowicz, G., Chockler, H., Halpern, J. Y., & Ivrii, A. (2014). The computational complexity of structure-based causality. In Proc. Twenty-Eighth National Conference on Artificial Intelligence (AAAI '14), pp. 974–980.

Beer, I., Ben-David, S., Chockler, H., Orni, A., & Trefler, R. J. (2012). Explaining counterexamples using causality. Formal Methods in System Design, 40(1), 20–40.

Chang, R., & Kadin, J. (1996). The Boolean hierarchy and the polynomial hierarchy: a closer connection. SIAM Journal on Computing, 25(2), 340–354.

Chockler, H., Fenton, N. E., Keppens, J., & Lagnado, D. A. (2015). Causal analysis for attributing responsibility in legal cases. In Proceedings of the 15th International Conference on Artificial Intelligence and Law, ICAIL, pp. 33–42. ACM.

Chockler, H., Halpern, J. Y., & Kupferman, O. (2008). What causes a system to satisfy a specification? ACM Transactions on Computational Logic, 9(3).

Chockler, H., & Halpern, J. (2004). Responsibility and blame: a structural-model approach. Journal of Artificial Intelligence Research (JAIR), 22, 93–115.

Collins, J., Hall, N., & Paul, L. A. (Eds.). (2004). Causation and Counterfactuals. MIT Press, Cambridge, MA.

Eiter, T., & Lukasiewicz, T. (2002). Complexity results for structure-based causality. Artificial Intelligence, 142(1), 53–89.

Garey, M., & Johnson, D. S. (1979). Computers and Intractability: A Guide to the Theory of NP-completeness. W. Freeman and Co., San Francisco.

Gerstenberg, T., & Lagnado, D. (2010). Spreading the blame: the allocation of responsibility amongst multiple agents. Cognition, 115, 166–171.

Hall, N. (2004). Two concepts of causation. In Collins, J., Hall, N., & Paul, L. A. (Eds.), Causation and Counterfactuals. MIT Press, Cambridge, MA.

Halpern, J. Y. (2008). Defaults and normality in causal structures. In Principles of Knowledge Representation and Reasoning: Proc. Eleventh International Conference (KR '08), pp. 198–208.

Halpern, J. Y. (2015). A modification of the Halpern-Pearl definition of causality. In Proc. 24th International Joint Conference on Artificial Intelligence (IJCAI 2015), pp. 3022–3033.

Halpern, J. Y. (2016). Actual Causality. MIT Press, Cambridge, MA.

Halpern, J. Y., & Pearl, J. (2001). Causes and explanations: A structural-model approach. Part I: Causes. In Proc. Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI 2001), pp. 194–202.

Halpern, J. Y., & Pearl, J. (2005a). Causes and explanations: A structural-model approach. Part I: Causes. British Journal for Philosophy of Science, 56(4), 843–887.

Halpern, J., & Pearl, J. (2005b). Causes and explanations: A structural-model approach. Part I: Causes. British Journal for Philosophy of Science, 56(4), 843–887.

Hopkins, M. (2001). A proof of the conjunctive cause conjecture. Unpublished manuscript.

Hopkins, M., & Pearl, J. (2003). Clarifying the usage of structural models for commonsense causal reasoning. In Proc. AAAI Spring Symposium on Logical Formalizations of Commonsense Reasoning.

Hume, D. (1739). A Treatise of Human Nature. John Noon, London.

Jenner, B., & Toran, J. (1995). Computing functions with parallel queries to NP. Theoretical Computer Science, 141, 175–193.

Johnson, D. S. (1990). A catalog of complexity classes. In Leeuwen, J. v. (Ed.), Handbook of Theoretical Computer Science, Vol. A, chap. 2. Elsevier Science.

Krentel, M. (1988). The complexity of optimization problems. Journal of the CSS, 36, 490–509.

Lagnado, D. A., Gerstenberg, T., & Zultan, R. (2013). Causal responsibility and counterfactuals. Cognitive Science, 37, 1036–1073.

Papadimitriou, C. H. (1984). The complexity of unique solutions. Journal of ACM, 31, 492–500.

Papadimitriou, C. H., & Yannakakis, M. (1984). The complexity of facets (and some facets of complexity). J. Comput. Syst. Sci., 28(2), 244–259.

Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press, New York.

Sipser, M. (2012). Introduction to Theory of Computation (third edition). Thomson Course Technology, Boston.

Stockmeyer, L. J. (1977). The polynomial-time hierarchy. Theoretical Computer Science, 3, 1–22.

Wrathall, C. (1976). Complete sets and the polynomial-time hierarchy. Theoretical Computer Science, 3(1), 23–33.