# The Computational Complexity of Structure-Based Causality

Gadi Aleksandrowicz (IBM Research Lab, Haifa, Israel; gadia@il.ibm.com), Hana Chockler (Department of Informatics, King's College London, UK; hana.chockler@kcl.ac.uk), Joseph Y. Halpern (Computer Science Department, Cornell University, Ithaca, NY, U.S.A.; halpern@cs.cornell.edu), Alexander Ivrii (IBM Research Lab, Haifa, Israel; alexi@il.ibm.com)

## Abstract

Halpern and Pearl introduced a definition of actual causality; Eiter and Lukasiewicz showed that computing whether $X = x$ is a cause of $Y = y$ is NP-complete in binary models (where all variables can take on only two values) and $\Sigma_2^P$-complete in general models. In the final version of their paper, Halpern and Pearl slightly modified the definition of actual cause, in order to deal with problems pointed out by Hopkins and Pearl. As we show, this modification has a nontrivial impact on the complexity of computing actual cause. To characterize the complexity, a new family $D_k^P$, $k = 1, 2, 3, \ldots$, of complexity classes is introduced, which generalizes the class $D^P$ introduced by Papadimitriou and Yannakakis ($D^P$ is just $D_1^P$). We show that the complexity of computing causality under the updated definition is $D_2^P$-complete. Chockler and Halpern extended the definition of causality by introducing notions of responsibility and blame. The complexity of determining the degree of responsibility and blame using the original definition of causality was completely characterized. Again, we show that changing the definition of causality affects the complexity, and completely characterize it using the updated definition.

## 1 Introduction

There have been many attempts to define causality, going back to Hume (1739) and continuing to the present (see, for example, (Collins, Hall, and Paul 2004; Pearl 2000) for some recent work). The standard definitions of causality are based on counterfactual reasoning. In this paper, we focus on one such definition, due to Halpern and Pearl, that has proved quite influential recently. The definition was originally introduced in 2001 (Halpern and Pearl 2001), but then modified in the final journal version (Halpern and Pearl 2005) to deal with problems pointed out by Hopkins and Pearl (2003). (For ease of reference, we call these definitions the original HP definition and the updated HP definition in the sequel.) In general, what can be a cause in both the original HP definition and the updated definition is a conjunction of the form $X_1 = x_1 \land \ldots \land X_k = x_k$, abbreviated $\vec{X} = \vec{x}$; what is caused can be an arbitrary Boolean combination $\varphi$ of formulas of the form $Y = y$. This should be thought of as saying that setting $X_1$ to $x_1$ and ... and setting $X_k$ to $x_k$ results in $\varphi$ being true. As shown by Eiter and Lukasiewicz (2002) and Hopkins (2001), under the original HP definition, we can always take causes to be single conjuncts. However, as shown by Halpern (2008), this is not the case for the updated HP definition. Using the fact that causes can be taken to be single conjuncts, Eiter and Lukasiewicz (2002) showed that deciding causality (that is, deciding whether $X = x$ is a cause of $\varphi$) is NP-complete in binary models (where all variables can take on only two values) and $\Sigma_2^P$-complete in general models. As we show here, this is no longer the case for the updated HP definition.
Indeed, we completely characterize the complexity of causality for the updated HP definition. To do so, we introduce a new family of complexity classes that may be of independent interest. Papadimitriou and Yannakakis (1984) introduced the complexity class $D^P$, which consists of all languages $L_3$ such that there exist a language $L_1$ in NP and a language $L_2$ in co-NP such that $L_3 = L_1 \cap L_2$. We generalize this by defining $D_k^P$ to consist of all languages $L_3$ such that there exist a language $L_1 \in \Sigma_k^P$ and a language $L_2 \in \Pi_k^P$ such that $L_3 = L_1 \cap L_2$. Since $\Sigma_1^P$ is NP and $\Pi_1^P$ is co-NP, $D_1^P$ is Papadimitriou and Yannakakis's $D^P$. We then show that deciding causality under the updated HP definition is $D_2^P$-complete, both for binary and general causal models. Papadimitriou and Yannakakis (1984) showed that a number of problems of interest are $D^P$-complete. To the best of our knowledge, this is the first time that a natural problem has been shown to be complete for $D_2^P$.

Although, in general, causes may not be single conjuncts, as observed by Halpern (2008), in many cases (in particular, in all the standard examples studied in the literature), they are. In an effort to understand the extent to which the difficulty in deciding causality stems from the fact that causes may require several conjuncts, we consider what we call the singleton cause problem; that is, the problem of deciding if $X = x$ is a cause of $\varphi$ (i.e., where there is only a single conjunct in the cause). We show that the singleton cause problem is simpler than the general causality problem (unless the polynomial hierarchy collapses): it is $\Sigma_2^P$-complete for both binary and general causal models. Thus, if we restrict to singleton causes, the complexity of deciding causality in general models is the same under the original and the updated HP definition, but in binary models, it is still simpler under the original HP definition.

Causality is a 0–1 concept; $X = x$ is either a cause of $\varphi$ or it is not. Now consider two voting scenarios: in the first, Mr. G beats Mr. B by a vote of 11–0. In the second, Mr. G beats Mr. B by a vote of 6–5. According to both the original and the updated HP definition, all the people who voted for Mr. G are causes of him winning. While this does not seem so unreasonable, it does not capture the intuition that each voter for Mr. G is more critical to the victory in the case of the 6–5 vote than in the case of the 11–0 vote. The notion of degree of responsibility, introduced by Chockler and Halpern (2004), does so. The idea is that the degree of responsibility of $X = x$ for $\varphi$ is $1/(k+1)$, where $k$ is the least number of changes that have to be made in order to make $X = x$ critical. In the case of the 6–5 vote, no changes have to be made to make each voter for Mr. G critical for Mr. G's victory; if he had not voted for Mr. G, Mr. G would not have won. Thus, each voter has degree of responsibility 1 (i.e., $k = 0$). On the other hand, in the case of the 11–0 vote, for a particular voter to be critical, five other voters have to switch their votes; thus, $k = 5$, and each voter's degree of responsibility is 1/6. This notion of degree of responsibility has been shown to capture (at a qualitative level) the way people allocate responsibility (Gerstenberg and Lagnado 2010; Lagnado, Gerstenberg, and Zultan 2013). Chockler and Halpern further extended the notion of degree of responsibility to degree of blame.
Formally, the degree of blame is the expected degree of responsibility. This is perhaps best understood by considering a firing squad with ten excellent marksmen. Only one of them has live bullets in his rifle; the rest have blanks. The marksmen do not know which of them has the live bullets. The marksmen shoot at the prisoner and he dies. The only marksman who is the cause of the prisoner's death is the one with the live bullets. That marksman has degree of responsibility 1 for the death; all the rest have degree of responsibility 0. However, each of the marksmen has degree of blame 1/10.

The complexity of determining the degree of responsibility and blame using the original definition of causality was completely characterized (Chockler and Halpern 2004; Chockler, Halpern, and Kupferman 2008). Again, we show that changing the definition of causality affects the complexity, and completely characterize the complexity of determining the degree of responsibility and blame with the updated definition.

The rest of this paper is organized as follows. In Section 2, we review the relevant definitions of causality. In Section 3, we briefly review the relevant definitions from complexity theory and define the complexity classes $D_k^P$. In Section 4, we prove our results on the complexity of causality. (Proof details omitted here can be found at http://www.cs.cornell.edu/home/halpern/papers/newcause.pdf.)

## 2 Causal Models and Causality: A Review

In this section, we review the details of Halpern and Pearl's definition of causal models and causality, describing both the original definition and the updated definition. This material is largely taken from (Halpern and Pearl 2005), to which we refer the reader for further details.

### 2.1 Causal models

A signature is a tuple $\mathcal{S} = \langle \mathcal{U}, \mathcal{V}, \mathcal{R} \rangle$, where $\mathcal{U}$ is a finite set of exogenous variables, $\mathcal{V}$ is a finite set of endogenous variables, and $\mathcal{R}$ associates with every variable $Y \in \mathcal{U} \cup \mathcal{V}$ a finite nonempty set $\mathcal{R}(Y)$ of possible values for $Y$. Intuitively, the exogenous variables are ones whose values are determined by factors outside the model, while the endogenous variables are ones whose values are ultimately determined by the exogenous variables. A causal model over signature $\mathcal{S}$ is a tuple $M = \langle \mathcal{S}, \mathcal{F} \rangle$, where $\mathcal{F}$ associates with every endogenous variable $X \in \mathcal{V}$ a function $F_X$ such that
$$F_X : \left(\times_{U \in \mathcal{U}} \mathcal{R}(U)\right) \times \left(\times_{Y \in \mathcal{V} \setminus \{X\}} \mathcal{R}(Y)\right) \to \mathcal{R}(X).$$
That is, $F_X$ describes how the value of the endogenous variable $X$ is determined by the values of all other variables in $\mathcal{U} \cup \mathcal{V}$. If $\mathcal{R}(Y)$ contains only two values for each $Y \in \mathcal{U} \cup \mathcal{V}$, then we say that $M$ is a binary causal model.

We can describe (some salient features of) a causal model $M$ using a causal network. A causal network is a graph with nodes corresponding to the random variables in $\mathcal{V}$ and an edge from a node labeled $X$ to one labeled $Y$ if $F_Y$ depends on the value of $X$. Intuitively, variables can have a causal effect only on their descendants in the causal network; if $Y$ is not a descendant of $X$, then a change in the value of $X$ has no effect on the value of $Y$. For ease of exposition, we restrict attention to what are called recursive models. These are ones whose associated causal network is a directed acyclic graph (that is, a graph that has no cycle of edges). Actually, it suffices for our purposes that, for each setting $\vec{u}$ for the variables in $\mathcal{U}$, there is no cycle among the edges of the causal network. We call a setting $\vec{u}$ for the variables in $\mathcal{U}$ a context. It should be clear that if $M$ is a recursive causal model, then there is always a unique solution to the equations in $M$, given a context.
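To make this concrete, here is a minimal sketch of recursive causal models in Python. The names (`CausalModel`, `solve`) are ours, not from the paper, and we assume the endogenous variables are listed in an order consistent with the causal network (parents first), which exists precisely because recursive models are acyclic.

```python
class CausalModel:
    """A sketch of a causal model M = (S, F) (our own encoding).

    endogenous: endogenous variable names, listed parents-first
                (possible because recursive models are acyclic).
    equations:  maps each endogenous X to a function F_X that reads
                the values of the other variables from a dict.
    """
    def __init__(self, endogenous, equations):
        self.endogenous = endogenous
        self.equations = equations

def solve(model, context):
    # Compute the unique solution of a recursive model in a context
    # (a dict giving the value of each exogenous variable).
    values = dict(context)
    for x in model.endogenous:          # one pass in causal order
        values[x] = model.equations[x](values)
    return values
```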
The equations determined by $\{F_X : X \in \mathcal{V}\}$ can be thought of as representing processes (or mechanisms) by which values are assigned to variables. For example, if $F_X(Y, Z, U) = Y + U$ (which we usually write as $X = Y + U$), then if $Y = 3$ and $U = 2$, then $X = 5$, regardless of how $Z$ is set. This equation also gives counterfactual information. It says that, in the context $U = 4$, if $Y$ were 4, then $X$ would be 8, regardless of what value $X$ and $Z$ actually take in the real world. That is, if $U = 4$ and the value of $Y$ were forced to be 4 (regardless of its actual value), then the value of $X$ would be 8.

While the equations for a given problem are typically obvious, the choice of variables may not be. Consider the following example (due to Hall (2004)), showing that the choice of variables influences the causal analysis. Suppose that Suzy and Billy both pick up rocks and throw them at a bottle. Suzy's rock gets there first, shattering the bottle. Since both throws are perfectly accurate, Billy's would have shattered the bottle had Suzy not thrown. In this case, a naive model might have an exogenous variable $U$ that encapsulates whatever background factors cause Suzy and Billy to decide to throw the rock (the details of $U$ do not matter, since we are interested only in the context where $U$'s value is such that both Suzy and Billy throw), a variable $ST$ for Suzy throws ($ST = 1$ if Suzy throws, and $ST = 0$ if she doesn't), a variable $BT$ for Billy throws, and a variable $BS$ for bottle shatters. In the naive model, whose graph is given in Figure 1, $BS$ is 1 if one of $ST$ and $BT$ is 1.

Figure 1: A naive model for the rock-throwing example. (The graph has edges from $ST$ and $BT$ to $BS$.)

This causal model does not distinguish between Suzy and Billy's rocks hitting the bottle simultaneously and Suzy's rock hitting first. A more sophisticated model might also include variables $SH$ and $BH$, for Suzy's rock hits the bottle and Billy's rock hits the bottle. Clearly $BS$ is 1 iff one of $SH$ and $BH$ is 1. However, now, $SH$ is 1 if $ST$ is 1, and $BH = 1$ if $BT = 1$ and $SH = 0$. Thus, Billy's throw hits if Billy throws and Suzy's rock doesn't hit. This model is described by the graph in Figure 2, where we implicitly assume a context where Suzy throws first, so there is an edge from $SH$ to $BH$, but not one in the other direction (and we omit the exogenous variable).

Figure 2: A better model for the rock-throwing example. (The graph has edges from $ST$ to $SH$, from $BT$ to $BH$, from $SH$ to $BH$, and from $SH$ and $BH$ to $BS$.)

Given a causal model $M = \langle \mathcal{S}, \mathcal{F} \rangle$, a (possibly empty) vector $\vec{X}$ of variables in $\mathcal{V}$, and a vector $\vec{x}$ of values for the variables in $\vec{X}$, we define a new causal model, denoted $M_{\vec{X} \leftarrow \vec{x}}$, which is identical to $M$, except that the equations for the variables $\vec{X}$ in $\mathcal{F}$ are replaced by $\vec{X} = \vec{x}$. Intuitively, this is the causal model that results when the variables in $\vec{X}$ are set to $\vec{x}$ by some external action that affects only the variables in $\vec{X}$ (and overrides the effects of the causal equations). For example, if $M$ is the more sophisticated model for the rock-throwing example, then $M_{ST \leftarrow 0}$ is the model where Suzy doesn't throw.

Given a signature $\mathcal{S} = \langle \mathcal{U}, \mathcal{V}, \mathcal{R} \rangle$, a formula of the form $X = x$, for $X \in \mathcal{V}$ and $x \in \mathcal{R}(X)$, is called a primitive event. A basic causal formula has the form $[Y_1 \leftarrow y_1, \ldots, Y_k \leftarrow y_k]\varphi$, where $\varphi$ is a Boolean combination of primitive events; $Y_1, \ldots, Y_k$ are distinct variables in $\mathcal{V}$; and $y_i \in \mathcal{R}(Y_i)$. Such a formula is abbreviated as $[\vec{Y} \leftarrow \vec{y}]\varphi$. The special case where $k = 0$ is abbreviated as $\varphi$. Intuitively, $[Y_1 \leftarrow y_1, \ldots, Y_k \leftarrow y_k]\varphi$ says that $\varphi$ holds in the counterfactual world that would arise if $Y_i$ is set to $y_i$, for $i = 1, \ldots, k$.
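Continuing the sketch above, an intervention $M_{\vec{X} \leftarrow \vec{x}}$ simply replaces equations by constants. The following hedged example (the helper `intervene` and the encoding of the model are ours) encodes the sophisticated rock-throwing model and evaluates one basic causal formula: in the context where both throw, $[ST \leftarrow 0](BS = 1)$ holds, since Billy's rock then hits.

```python
def intervene(model, setting):
    # M_{X <- x}: replace the equations of the variables in
    # `setting` by constant functions.
    eqs = dict(model.equations)
    for x, val in setting.items():
        eqs[x] = lambda _vals, val=val: val
    return CausalModel(model.endogenous, eqs)

# The sophisticated rock-throwing model of Figure 2 (our encoding).
M2 = CausalModel(
    ["ST", "BT", "SH", "BH", "BS"],
    {
        "ST": lambda v: v["U"],                   # Suzy throws in this context
        "BT": lambda v: v["U"],                   # so does Billy
        "SH": lambda v: v["ST"],                  # Suzy's rock hits if she throws
        "BH": lambda v: v["BT"] and not v["SH"],  # Billy hits only if Suzy misses
        "BS": lambda v: v["SH"] or v["BH"],       # bottle shatters if either hits
    },
)

u = {"U": True}  # the context where both Suzy and Billy throw
assert solve(M2, u)["BS"]                             # (M', u) |= BS = 1
assert solve(intervene(M2, {"ST": False}), u)["BS"]   # [ST <- 0](BS = 1)
```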
A causal formula is a Boolean combination of basic causal formulas. A causal formula $\psi$ is true or false in a causal model, given a context. We write $(M, \vec{u}) \models \psi$ if $\psi$ is true in causal model $M$ given context $\vec{u}$. $(M, \vec{u}) \models [\vec{Y} \leftarrow \vec{y}](X = x)$ if the variable $X$ has value $x$ in the unique (since we are dealing with recursive models) solution to the equations in $M_{\vec{Y} \leftarrow \vec{y}}$ in context $\vec{u}$ (i.e., the unique vector of values for the endogenous variables that simultaneously satisfies all equations $F_Z^{\vec{Y} \leftarrow \vec{y}}$, $Z \in \mathcal{V} \setminus \vec{Y}$, with the variables in $\mathcal{U}$ set to $\vec{u}$). We extend the definition to arbitrary causal formulas in the obvious way.

### 2.2 Causality

We now review the updated HP definition of causality.

**Definition 2.1** $\vec{X} = \vec{x}$ is a cause of $\varphi$ in $(M, \vec{u})$ if the following three conditions hold:

AC1. $(M, \vec{u}) \models (\vec{X} = \vec{x}) \land \varphi$.

AC2. There exist a partition $(\vec{Z}, \vec{W})$ of $\mathcal{V}$ with $\vec{X} \subseteq \vec{Z}$ and some setting $(\vec{x}', \vec{w})$ of the variables in $(\vec{X}, \vec{W})$ such that if $(M, \vec{u}) \models Z = z^*$ for $Z \in \vec{Z}$, then

(a) $(M, \vec{u}) \models [\vec{X} \leftarrow \vec{x}', \vec{W} \leftarrow \vec{w}]\neg\varphi$.

(b) $(M, \vec{u}) \models [\vec{X} \leftarrow \vec{x}, \vec{W}' \leftarrow \vec{w}, \vec{Z}' \leftarrow \vec{z}^*]\varphi$ for all subsets $\vec{Z}'$ of $\vec{Z} \setminus \vec{X}$ and all subsets $\vec{W}'$ of $\vec{W}$, where we abuse notation and write $\vec{W}' \leftarrow \vec{w}$ to denote the assignment where the variables in $\vec{W}'$ get the same values as they would in the assignment $\vec{W} \leftarrow \vec{w}$, and similarly for $\vec{Z}' \leftarrow \vec{z}^*$. That is, setting any subset $\vec{W}'$ of $\vec{W}$ to the values in $\vec{w}$ should have no effect on $\varphi$ as long as $\vec{X}$ has the value $\vec{x}$, even if all the variables in an arbitrary subset of $\vec{Z}$ are set to their original values in the context $\vec{u}$.

AC3. $(\vec{X} = \vec{x})$ is minimal; no subset of $\vec{X}$ satisfies AC2.

If $\vec{X}$ is a singleton, then $\vec{X} = \vec{x}$ is said to be a singleton cause of $\varphi$ in $(M, \vec{u})$.

AC1 just says that $A$ cannot be a cause of $B$ unless both $A$ and $B$ are true. The core of this definition lies in AC2. Informally, the variables in $\vec{Z}$ should be thought of as describing the "active causal process" from $\vec{X}$ to $\varphi$. These are the variables that mediate between $\vec{X}$ and $\varphi$. AC2(a) is reminiscent of the traditional counterfactual criterion, according to which $\vec{X} = \vec{x}$ is a cause of $\varphi$ if changing the value of $\vec{X}$ results in $\varphi$ being false. However, AC2(a) is more permissive than the traditional criterion; it allows the dependence of $\varphi$ on $\vec{X}$ to be tested under special structural contingencies, in which the variables $\vec{W}$ are held constant at some setting $\vec{w}$. AC2(b) is an attempt to counteract the "permissiveness" of AC2(a) with regard to structural contingencies. Essentially, it ensures that $\vec{X}$ alone suffices to bring about the change from $\varphi$ to $\neg\varphi$; setting $\vec{W}$ to $\vec{w}$ merely eliminates spurious side effects that tend to mask the action of $\vec{X}$.

To understand the role of AC2(b), consider the rock-throwing example again. Let $M$ be the model in Figure 1, and let $\vec{u}$ be the context where both Suzy and Billy throw. It is easy to see that both Suzy and Billy are causes of the bottle shattering in $(M, \vec{u})$: let $\vec{Z} = \{ST, BS\}$, and consider the structural contingency where Billy doesn't throw ($BT = 0$). Clearly $(M, \vec{u}) \models [ST \leftarrow 0, BT \leftarrow 0](BS = 0)$ and $(M, \vec{u}) \models [ST \leftarrow 1, BT \leftarrow 0](BS = 1)$, so Suzy is a cause of the bottle shattering. A symmetric argument shows that Billy is also a cause.

But now consider the model $M'$ described in Figure 2; again, $\vec{u}$ is the context where both Suzy and Billy throw. It is still the case that Suzy is a cause of the bottle shattering in $(M', \vec{u})$. We can take $\vec{W} = \{BT\}$ and again consider the contingency where Billy doesn't throw. However, Billy is not a cause of the bottle shattering in $(M', \vec{u})$. For suppose that we now take $\vec{W} = \{ST\}$ and consider the contingency where Suzy doesn't throw. Clearly AC2(a) holds, since if Billy doesn't throw (under this contingency), then the bottle doesn't shatter. However, AC2(b) does not hold. Since $BH \in \vec{Z}$, if we set $BH$ to 0 (its original value), then AC2(b) would require that $(M', \vec{u}) \models [BT \leftarrow 1, ST \leftarrow 0, BH \leftarrow 0](BS = 1)$, but this is not the case. Similar arguments show that no other choice of $(\vec{Z}, \vec{W})$ makes Billy's throw a cause of the bottle shattering in $(M', \vec{u})$.
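The AC2(b) failure for Billy can be checked mechanically with the sketch above (again, `solve`, `intervene`, and `M2` are our hypothetical encoding): under the contingency $ST = 0$, AC2(a) holds, but restoring $BH$ to its original value 0 while setting $BT$ back to 1 leaves the bottle unshattered, violating AC2(b).

```python
# AC2(a) for Billy under the contingency ST = 0: if Billy also
# doesn't throw, the bottle does not shatter.
assert not solve(intervene(M2, {"BT": False, "ST": False}), u)["BS"]

# AC2(b) check: with BT back at its actual value 1 and the subset
# {BH} of Z set to its original value 0, AC2(b) demands BS = 1;
# instead the bottle does not shatter, so Billy fails AC2(b).
assert not solve(intervene(M2, {"BT": True, "ST": False, "BH": False}), u)["BS"]
```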
The original HP definition differs from the updated definition in only one respect. Rather than requiring that $(M, \vec{u}) \models [\vec{X} \leftarrow \vec{x}, \vec{W}' \leftarrow \vec{w}, \vec{Z}' \leftarrow \vec{z}^*]\varphi$ for all subsets $\vec{W}'$ of $\vec{W}$, it was required to hold only for $\vec{W}$. That is, the following condition was used instead of AC2(b):

AC2(b′) $(M, \vec{u}) \models [\vec{X} \leftarrow \vec{x}, \vec{W} \leftarrow \vec{w}, \vec{Z}' \leftarrow \vec{z}^*]\varphi$ for all subsets $\vec{Z}'$ of $\vec{Z}$.

The requirement that AC2(b) hold for all subsets of $\vec{W}$ in the updated definition prevents situations where $\vec{W}$ "conceals" other causes for $\varphi$. The role of this requirement is perhaps best understood by considering the following example, due to Hopkins and Pearl (2003) (the description is taken from (Halpern and Pearl 2005)): Suppose that a prisoner dies either if A loads B's gun and B shoots, or if C loads and shoots his gun. Taking $D$ to represent the prisoner's death and making the obvious assumptions about the meaning of the variables, we have that $D = (A \land B) \lor C$. Suppose that in the actual context $\vec{u}$, A loads B's gun, B does not shoot, but C does load and shoot his gun, so that the prisoner dies. That is, $A = 1$, $B = 0$, and $C = 1$. Clearly $C = 1$ is a cause of $D = 1$. We would not want to say that $A = 1$ is a cause of $D = 1$, given that B did not shoot (i.e., given that $B = 0$). However, with AC2(b′), $A = 1$ is a cause of $D = 1$. For we can take $\vec{W} = \{B, C\}$ and consider the contingency where $B = 1$ and $C = 0$. It is easy to check that AC2(a) and AC2(b′) hold for this contingency, so under the original HP definition, $A = 1$ is a cause of $D = 1$. However, AC2(b) fails in this case, since $(M, \vec{u}) \models [A \leftarrow 1, C \leftarrow 0](D = 0)$. The key point is that AC2(b) says that for $A = 1$ to be a cause of $D = 1$, it must be the case that $D = 0$ even if only some of the values in $\vec{W}$ are set to $\vec{w}$; the other variables then get the same value as they do in the actual context. In this case, by setting only $A$ to 1 and leaving $B$ unset, $B$ takes on its original value of 0, in which case $D = 0$. AC2(b′) does not consider this case.

Using AC2(b) rather than AC2(b′) has been shown to have a significant benefit (and to lead to more intuitive results) when causality is applied to program verification, with the goal of understanding what in the code is the cause of a program not satisfying its specification (Beer et al. 2012).

## 3 Relevant Complexity Classes

In this section, we briefly recall the definitions of the complexity classes that we need for our results, and define the complexity classes $D_k^P$. Recall that the polynomial hierarchy is a hierarchy of complexity classes that generalize the classes NP and co-NP. Let $\Sigma_1^P = \mathrm{NP}$ and $\Pi_1^P = \text{co-NP}$. For $i > 1$, define $\Sigma_i^P = \mathrm{NP}^{\Sigma_{i-1}^P}$ and $\Pi_i^P = (\text{co-NP})^{\Sigma_{i-1}^P}$, where, in general, $X^Y$ denotes the class of problems solvable by a Turing machine in class $X$ augmented with an oracle for a problem complete for class $Y$. (See (Meyer and Stockmeyer 1972; Stockmeyer 1977) for more details and intuition.) We now define the classes $D_k^P$ as follows.

**Definition 3.1** For $k = 1, 2, \ldots$,
$$D_k^P = \{L : \exists L_1, L_2 : L_1 \in \Sigma_k^P,\ L_2 \in \Pi_k^P,\ L = L_1 \cap L_2\}.$$
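For intuition (this example is ours, not from the paper), the canonical $D^P$-complete language pairs an NP condition with a co-NP condition; Lemma 3.2 below implies that the analogous pairing of the canonical $\Sigma_2^P$- and $\Pi_2^P$-complete QBF languages (defined in the next paragraph) is $D_2^P$-complete.

```latex
% Illustration (ours): the canonical D^P-complete language.
\[
  \textsc{Sat-Unsat} = \{\langle\varphi,\psi\rangle :
     \varphi\ \text{is satisfiable and}\ \psi\ \text{is unsatisfiable}\}.
\]
% One level up, pairing true exists-forall and forall-exists QBFs
% gives, by Lemma 3.2, a D_2^P-complete language:
\[
  \{\langle\Phi,\Psi\rangle :
     \Phi\in\Sigma_2^P(\mathrm{SAT})\ \text{and}\
     \Psi\in\Pi_2^P(\mathrm{SAT})\}.
\]
```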
For $k = 1$, the class $D_1^P$ is the well-known complexity class $D^P$, defined by Papadimitriou and Yannakakis (1984). It contains "exact" problems such as the language of pairs $\langle G, k \rangle$, where $G$ is a graph that has a maximal clique of size exactly $k$. As usual, we say that a language $L$ is $D_k^P$-complete if it is in $D_k^P$ and is the "hardest" language in $D_k^P$, in the sense that there is a polynomial-time reduction from any language $L' \in D_k^P$ to $L$.

Recall that a quantified Boolean formula (QBF) is a generalization of a propositional formula, where some propositional variables are quantified. Thus, for example, $\exists x \forall y (x \lor y)$ is a QBF. A closed QBF (CQBF) is one where there are no free propositional variables. A CQBF is either true or false, independent of the truth assignment. The canonical languages complete for $\Sigma_k^P$ and $\Pi_k^P$ consist of the true CQBFs with $k$ alternations of quantifiers starting with $\exists$ (resp., $\forall$). In particular, let
$$\Sigma_2^P(\mathrm{SAT}) = \{\exists \vec{X} \forall \vec{Y} \varphi \mid \exists \vec{X} \forall \vec{Y} \varphi \text{ is a CQBF that is true}\},$$
$$\Pi_2^P(\mathrm{SAT}) = \{\forall \vec{X} \exists \vec{Y} \varphi \mid \forall \vec{X} \exists \vec{Y} \varphi \text{ is a CQBF that is true}\}.$$
$\Sigma_2^P(\mathrm{SAT})$ is complete for $\Sigma_2^P$ and $\Pi_2^P(\mathrm{SAT})$ is complete for $\Pi_2^P$ (Wrathall 1976).

The following lemma provides a condition sufficient for a language to be $D_k^P$-complete.

**Lemma 3.2** If $L_1$ is $\Sigma_k^P$-complete and $L_2$ is $\Pi_k^P$-complete, then $L_3 = L_1 \cap L_2$ is $D_k^P$-complete.

**Proof:** The fact that $L_3$ is in $D_k^P$ is immediate from the definition of $D_k^P$. For hardness, let $L_3'$ be a language in $D_k^P$. Then there exist $L_1'$ and $L_2'$ such that $L_1' \in \Sigma_k^P$, $L_2' \in \Pi_k^P$, and $L_3' = L_1' \cap L_2'$. Let $f$ be a polynomial-time reduction from $L_1'$ to $L_1$, and let $g$ be a polynomial-time reduction from $L_2'$ to $L_2$ (the existence of such reductions $f$ and $g$ follows from the fact that $L_1$ and $L_2$ are $\Sigma_k^P$-complete and $\Pi_k^P$-complete, respectively). Then $\langle f, g \rangle$ is a polynomial-time reduction from $L_3'$ to $L_3$, as required.

Essentially the same argument shows that if $L_1$ is $\Sigma_k^P$-hard and $L_2$ is $\Pi_k^P$-hard, then $L_3 = L_1 \cap L_2$ is $D_k^P$-hard.

Determining whether $\vec{X} = \vec{x}$ is a cause of $\varphi$ in $(M, \vec{u})$ is a decision problem: we define a language and try to determine whether a particular tuple is in that language. (See Section 4 for the formal definition.) Determining the degree of responsibility and blame is a different type of problem, since we are determining which number represents the degree of responsibility (resp., blame). Formally, these are function problems. For ease of exposition, we restrict attention to functions from strings over some fixed alphabet $\Sigma$ to strings over $\Sigma$ (i.e., we are considering functions from $\Sigma^*$ to $\Sigma^*$). For a complexity class $A$ in the polynomial hierarchy, $\mathrm{FP}^{A[\log n]}$ consists of all functions that can be computed by a polynomial-time Turing machine with an $A$-oracle that, on input $x$, asks a total of $O(\log |x|)$ queries (Papadimitriou 1984). A function $f(x)$ is $\mathrm{FP}^{A[\log n]}$-hard iff for every function $g(x)$ in $\mathrm{FP}^{A[\log n]}$ there exist polynomial-time computable functions $R, S : \Sigma^* \to \Sigma^*$ such that $g(x) = S(f(R(x)))$. A function $f(x)$ is complete for $\mathrm{FP}^{A[\log n]}$ iff it is in $\mathrm{FP}^{A[\log n]}$ and is $\mathrm{FP}^{A[\log n]}$-hard. Finally, for a complexity class $A$ in the polynomial hierarchy, $\mathrm{FP}^A_{||}$ is the class of functions that can be computed by a polynomial-time Turing machine with parallel (i.e., non-adaptive) queries to an $A$-oracle. (For background on these complexity classes, see (Jenner and Toran 1995; Johnson 1990).)

## 4 Complexity for the Updated HP Definition

In this section, we prove our results on the complexity of deciding causality. We start by defining the problem formally.
In the definitions, $M$ stands for a causal model, $\vec{u}$ is a context, $\vec{X}$ is a subset of the variables of $M$, and $\vec{x}$ is the set of values of $\vec{X}$ in $(M, \vec{u})$:
$$L_{\mathrm{cause}} = \{\langle M, \vec{u}, \varphi, \vec{X}, \vec{x} \rangle : (\vec{X} = \vec{x}) \text{ is a cause of } \varphi \text{ in } (M, \vec{u})\}.$$

One of our goals is to understand the cause of the complexity of computing causality. Towards this end, it is useful to define two related languages:
$$L_{\mathrm{AC2}} = \{\langle M, \vec{u}, \varphi, \vec{X}, \vec{x} \rangle : (\vec{X} = \vec{x}) \text{ satisfies conditions AC1 and AC2 of Def. 2.1 for } \varphi \text{ in } (M, \vec{u})\},$$
$$L_{\mathrm{AC3}} = \{\langle M, \vec{u}, \varphi, \vec{X}, \vec{x} \rangle : (\vec{X} = \vec{x}) \text{ satisfies conditions AC1 and AC3 of Def. 2.1 for } \varphi \text{ in } (M, \vec{u})\}.$$

It is easy to see that $L_{\mathrm{cause}} = L_{\mathrm{AC2}} \cap L_{\mathrm{AC3}}$. Let $L^1_{\mathrm{cause}}$ be the subset of $L_{\mathrm{cause}}$ where $\vec{X}$ and $\vec{x}$ are singletons; this is the singleton causality problem. We can similarly define $L^1_{\mathrm{AC2}}$ and $L^1_{\mathrm{AC3}}$. Again, we have $L^1_{\mathrm{cause}} = L^1_{\mathrm{AC2}} \cap L^1_{\mathrm{AC3}}$ but, in fact, we have $L^1_{\mathrm{cause}} = L^1_{\mathrm{AC2}}$, since $L^1_{\mathrm{AC2}} \subseteq L^1_{\mathrm{AC3}}$; for singleton causality, the minimality condition AC3 trivially holds. We denote by $L^B_{\mathrm{cause}}$ the language of causality for binary causal models (i.e., where the models $M$ in the tuples are binary models), and by $L^B_{\mathrm{AC2}}$ and $L^B_{\mathrm{AC3}}$ the languages $L_{\mathrm{AC2}}$ and $L_{\mathrm{AC3}}$ restricted to binary causal models. Again we have that $L^B_{\mathrm{cause}} = L^B_{\mathrm{AC2}} \cap L^B_{\mathrm{AC3}}$. And again, we can define $L^{B,1}_{\mathrm{cause}}$, $L^{B,1}_{\mathrm{AC2}}$, and $L^{B,1}_{\mathrm{AC3}}$, and we have $L^{B,1}_{\mathrm{cause}} = L^{B,1}_{\mathrm{AC2}}$.

We start by considering singleton causality. As we observed, Eiter and Lukasiewicz (2002) and Hopkins (2001) showed that, with the original HP definition, singleton causality and causality coincide. However, for the updated definition, Halpern (2008) showed that it is in fact possible to have minimal causes that are not singletons. Thus, we consider singleton causality and general causality separately. We can clarify where the complexity lies by considering $L_{\mathrm{AC2}}$ (and its sublanguages) and $L_{\mathrm{AC3}}$ (and its sublanguages) separately.

**Theorem 4.1** The languages $L_{\mathrm{AC2}}$, $L^1_{\mathrm{AC2}}$, $L^{B,1}_{\mathrm{AC2}}$, and $L^B_{\mathrm{AC2}}$ are $\Sigma_2^P$-complete.

**Proof outline:** To show that all these languages are in $\Sigma_2^P$, observe that, given a tuple $\langle M, \vec{u}, \varphi, \vec{X}, \vec{x} \rangle$, checking that AC1 holds, that is, checking that $(M, \vec{u}) \models \vec{X} = \vec{x} \land \varphi$, can be done in time polynomial in the size of $M$, $|\vec{X}|$, and $|\varphi|$ (the length of $\varphi$ as a string of symbols). For AC2, we need only guess the set $\vec{W}$ and the assignments $\vec{w}$ and $\vec{x}'$. The check that assigning $\vec{w}$ to $\vec{W}$ and $\vec{x}'$ to $\vec{X}$ indeed falsifies $\varphi$ is polynomial, and we use an NP oracle to check that condition AC2(b) holds for all subsets of $\vec{W}$ and all subsets of $\vec{Z}$. (The argument is quite similar to Eiter and Lukasiewicz's argument that causality is in $\Sigma_2^P$ for general models with the original HP definition.)

For hardness, it clearly suffices to show that $L^{B,1}_{\mathrm{AC2}}$ is $\Sigma_2^P$-hard. We do this by reducing $\Sigma_2^P(\mathrm{SAT})$ to $L^{B,1}_{\mathrm{AC2}}$. Given a CQBF $\exists \vec{X} \forall \vec{Y} \varphi$, we show that we can efficiently construct a causal formula $\psi$, model $M$, and context $\vec{u}$ such that $\exists \vec{X} \forall \vec{Y} \varphi$ is true iff $\langle M, \vec{u}, \psi, A, 0 \rangle \in L^{B,1}_{\mathrm{AC2}}$. We leave details to the full paper.

Since, as we have observed, AC3 is vacuous in the case of singleton causality, it follows that singleton causality is $\Sigma_2^P$-complete.

**Corollary 4.2** $L^1_{\mathrm{cause}}$ and $L^{B,1}_{\mathrm{cause}}$ are $\Sigma_2^P$-complete.

We now show that things are harder if we do not restrict to singleton causes (unless the polynomial hierarchy collapses). As a first step, we consider the complexity of $L_{\mathrm{AC3}}$ and $L^B_{\mathrm{AC3}}$.

**Theorem 4.3** $L_{\mathrm{AC3}}$ and $L^B_{\mathrm{AC3}}$ are $\Pi_2^P$-complete.

**Proof outline:** The fact that $L_{\mathrm{AC3}}$ and $L^B_{\mathrm{AC3}}$ are in $\Pi_2^P$ is straightforward. Again, given a tuple $\langle M, \vec{u}, \varphi, \vec{X}, \vec{x} \rangle$, we can check that AC1 holds in polynomial time. For AC3, we need to check that AC2 fails for all strict subsets $\vec{X}'$ of $\vec{X}$. Since checking AC2 is in $\Sigma_2^P$, checking that it fails is in $\Pi_2^P$. Checking that it fails for all strict subsets $\vec{X}'$ keeps it in $\Pi_2^P$ (since it just adds one more universal quantifier). To prove that these languages are $\Pi_2^P$-hard, we show that we can reduce $\Pi_2^P(\mathrm{SAT})$ to $L^B_{\mathrm{AC3}}$. The proof is similar in spirit to the proof of Theorem 4.1; we leave details to the full paper.
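The structure of these checks can be mirrored in a (worst-case exponential) brute-force decision procedure for small finite models. The sketch below is ours, reusing the hypothetical `solve`/`intervene` helpers from Section 2; it searches over partitions and settings for AC2, verifies AC2(b) on all relevant subsets, and then checks the minimality condition AC3, so that $L_{\mathrm{cause}}$ is exactly the conjunction of the AC1/AC2 and AC1/AC3 checks.

```python
from itertools import chain, combinations, product

def subsets(s):
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def holds_ac2(model, u, phi, X, ranges):
    # X maps the candidate cause variables to their actual values;
    # phi maps a full assignment to True/False; ranges maps each
    # variable to its finite set of possible values.
    actual = solve(model, u)
    others = [v for v in model.endogenous if v not in X]
    for W in subsets(others):                  # partition: Z = V \ W
        Z = [v for v in others if v not in W]
        for xp in product(*(ranges[x] for x in X)):     # setting x'
            for w in product(*(ranges[v] for v in W)):  # setting w
                wmap = dict(zip(W, w))
                if phi(solve(intervene(model, dict(zip(X, xp)) | wmap), u)):
                    continue                   # AC2(a) fails for this witness
                # AC2(b): phi must survive X <- x for every W', Z'
                if all(phi(solve(intervene(model,
                               dict(X)
                               | {v: wmap[v] for v in Wp}
                               | {v: actual[v] for v in Zp}), u))
                       for Wp in subsets(W) for Zp in subsets(Z)):
                    return True
    return False

def is_cause(model, u, phi, X, ranges):
    actual = solve(model, u)
    ac1 = phi(actual) and all(actual[x] == v for x, v in X.items())
    ac3 = not any(holds_ac2(model, u, phi, {x: X[x] for x in S}, ranges)
                  for S in subsets(X) if 0 < len(S) < len(X))
    return ac1 and holds_ac2(model, u, phi, X, ranges) and ac3
```

On the encoding of Section 2, with `ranges = {v: [False, True] for v in M2.endogenous}`, the call `is_cause(M2, u, lambda v: v["BS"], {"ST": True}, ranges)` should come out True while `{"BT": True}` comes out False, matching the Suzy/Billy analysis above.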
We are now ready to prove our main result.

**Theorem 4.4** $L_{\mathrm{cause}}$ and $L^B_{\mathrm{cause}}$ are $D_2^P$-complete.

**Proof:** Membership of $L_{\mathrm{cause}}$ (and hence also $L^B_{\mathrm{cause}}$) in $D_2^P$ follows from the fact that $L_{\mathrm{cause}} = L_{\mathrm{AC2}} \cap L_{\mathrm{AC3}}$, $L_{\mathrm{AC2}} \in \Sigma_2^P$, and $L_{\mathrm{AC3}} \in \Pi_2^P$. The fact that $L^B_{\mathrm{cause}}$ (and hence also $L_{\mathrm{cause}}$) is $D_2^P$-hard follows from Lemma 3.2 and Theorems 4.1 and 4.3.

## 5 Responsibility and Blame

In this section, we review the definitions of responsibility and blame and characterize their complexity. See Chockler and Halpern (2004) for more intuition and details.

### 5.1 Responsibility

The definition of responsibility given by Chockler and Halpern (2004) was based on the original HP definition of causality, and thus assumed that causes were always single conjuncts. It is straightforward to extend it to allow causes to have arbitrarily many conjuncts.

**Definition 5.1** The degree of responsibility of $\vec{X} = \vec{x}$ for $\varphi$ in $(M, \vec{u})$, denoted $dr((M, \vec{u}), (\vec{X} = \vec{x}), \varphi)$, is 0 if $\vec{X} = \vec{x}$ is not a cause of $\varphi$ in $(M, \vec{u})$; it is $1/(k+1)$ if $\vec{X} = \vec{x}$ is a cause of $\varphi$ in $(M, \vec{u})$ and there exists a partition $(\vec{Z}, \vec{W})$ and setting $(\vec{x}', \vec{w})$ for which AC2 holds such that (a) $k$ variables in $\vec{W}$ have different values in $\vec{w}$ than they do in the context $\vec{u}$, and (b) there is no partition $(\vec{Z}', \vec{W}')$ and setting $(\vec{x}'', \vec{w}')$ satisfying AC2 such that only $k' < k$ variables have different values in $\vec{w}'$ than they do in the context $\vec{u}$.

Intuitively, $dr((M, \vec{u}), (\vec{X} = \vec{x}), \varphi)$ measures the minimal number of changes that have to be made in $\vec{u}$ in order to make $\varphi$ counterfactually depend on $\vec{X}$, provided the conditions on the subsets of $\vec{W}$ and $\vec{Z}$ are satisfied (see also the voting example from the introduction). If there is no partition of $\mathcal{V}$ into $(\vec{Z}, \vec{W})$ that satisfies AC2, or $(\vec{X} = \vec{x})$ does not satisfy AC3 for $\varphi$ in $(M, \vec{u})$, then the minimal number of changes in $\vec{u}$ in Definition 5.1 is taken to be $\infty$, and thus the degree of responsibility of $(\vec{X} = \vec{x})$ is 0 (and hence it is not a cause).

For the original HP definition, it was shown that computing responsibility is $\mathrm{FP}^{\mathrm{NP}[\log n]}$-complete in binary causal models (Chockler, Halpern, and Kupferman 2008) and $\mathrm{FP}^{\Sigma_2^P[\log n]}$-complete in general causal models (Chockler and Halpern 2004). We now characterize the complexity of computing responsibility under the updated HP definition.

**Theorem 5.2** Computing the degree of responsibility is $\mathrm{FP}^{\Sigma_2^P[\log n]}$-complete for singleton causes in binary and general causal models.

**Proof outline:** The proof is quite similar to the proof in (Chockler and Halpern 2004). We prove membership by describing an algorithm in $\mathrm{FP}^{\Sigma_2^P[\log n]}$ for computing the degree of responsibility. Roughly speaking, the algorithm queries an oracle for the language $R = \{\langle (M, \vec{u}), (X = x), \varphi, i \rangle : \langle (M, \vec{u}), (X = x), \varphi \rangle \in L_{\mathrm{cause}}$ and the degree of responsibility of $(X = x)$ for $\varphi$ is at least $i\}$. It is easy to see that $R$ is in $\Sigma_2^P$, using Corollary 4.2. The algorithm for computing the degree of responsibility performs a binary search on the value of $dr((M, \vec{u}), (X = x), \varphi)$, each time dividing the range of possible values for the degree of responsibility by 2 according to the answer of $R$. The number of possible candidates for the degree of responsibility is bounded by the size of the input $n$, and thus the number of queries is at most $\lceil \log n \rceil$. For hardness in binary causal models (which implies hardness in general causal models), we provide a reduction from the $\mathrm{FP}^{\Sigma_2^P[\log n]}$-complete problem MINQSAT$_2$ (Chockler and Halpern 2004) to the degree of responsibility, where MINQSAT$_2(\exists \vec{X} \forall \vec{Y} \psi)$ is the minimum number of 1's in a satisfying assignment to $\vec{X}$ for $\exists \vec{X} \forall \vec{Y} \psi$ if such an assignment exists, and $|\vec{X}| + 1$ otherwise.

**Theorem 5.3** Computing the degree of responsibility is $\mathrm{FP}^{D_2^P[\log n]}$-complete in binary and general causal models.
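The membership argument in Theorem 5.2 is an instance of a standard pattern: binary search over the $O(n)$ candidate values $0, 1, 1/2, 1/3, \ldots$ driven by a threshold oracle. Here is a minimal sketch; the oracle `R` is a stub standing in for the $\Sigma_2^P$ oracle, and all names are ours.

```python
def degree_of_responsibility(instance, n, R):
    """Binary search for dr(...) using O(log n) oracle queries.

    R(instance, i) is a stub for the oracle answering whether the
    degree of responsibility is at least 1/(i + 1), i.e., whether
    at most i changes to the context make X = x critical; n bounds
    the number of candidate values k = 0, 1, ..., n - 1.
    """
    if not R(instance, n - 1):   # dr is 0: X = x is not a cause
        return 0.0
    lo, hi = 0, n - 1            # smallest i with R true lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if R(instance, mid):     # answer is in the lower half
            hi = mid
        else:                    # more than mid changes are needed
            lo = mid + 1
    return 1.0 / (lo + 1)        # about log2(n) adaptive queries in all
```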
### 5.2 Blame

The definition of blame addresses the situation where there is uncertainty about the "true situation" or "how the world works". Blame, introduced in (Chockler and Halpern 2004), takes the "true situation" to be determined by the context, and "how the world works" to be determined by the structural equations. An agent's uncertainty is modeled by a pair $(\mathcal{K}, \Pr)$, where $\mathcal{K}$ is a set of pairs of the form $(M, \vec{u})$, where $M$ is a causal model and $\vec{u}$ is a context, and $\Pr$ is a probability distribution over $\mathcal{K}$. A pair $(M, \vec{u})$ is called a situation. We think of $\mathcal{K}$ as describing the situations that the agent considers possible before $\vec{X}$ is set to $\vec{x}$. The degree of blame that setting $\vec{X}$ to $\vec{x}$ has for $\varphi$ is then the expected degree of responsibility of $\vec{X} = \vec{x}$ for $\varphi$ in $(M_{\vec{X} \leftarrow \vec{x}}, \vec{u})$, taken over the situations $(M, \vec{u}) \in \mathcal{K}$. Note that the situations $(M_{\vec{X} \leftarrow \vec{x}}, \vec{u})$ for $(M, \vec{u}) \in \mathcal{K}$ are those that the agent considers possible after $\vec{X}$ is set to $\vec{x}$.

**Definition 5.4** The degree of blame of setting $\vec{X}$ to $\vec{x}$ for $\varphi$ relative to epistemic state $(\mathcal{K}, \Pr)$, denoted $db(\mathcal{K}, \Pr, \vec{X} \leftarrow \vec{x}, \varphi)$, is
$$\sum_{(M, \vec{u}) \in \mathcal{K}} dr((M_{\vec{X} \leftarrow \vec{x}}, \vec{u}), \vec{X} = \vec{x}, \varphi) \Pr((M, \vec{u})).$$

For the original HP definition of causality, Chockler and Halpern (2004) show that computing the degree of blame is complete for $\mathrm{FP}^{\Sigma_2^P}_{||}$ in general causal models and for $\mathrm{FP}^{\mathrm{NP}}_{||}$ in binary causal models. Again, with the updated HP definition, the complexity changes.

**Theorem 5.5** The problem of computing blame in recursive causal models is $\mathrm{FP}^{\Sigma_2^P}_{||}$-complete for singleton causes and $\mathrm{FP}^{D_2^P}_{||}$-complete for (general) causes, in binary and general causal models.
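Definition 5.4 is just an expectation, as the following sketch makes explicit; it assumes a hypothetical `dr` function computing the degree of responsibility (e.g., built on the brute-force checker above) and reuses our `intervene` helper.

```python
def degree_of_blame(K, Pr, X, phi, dr):
    # db(K, Pr, X <- x, phi): the expected degree of responsibility
    # of X = x for phi, over the post-intervention situations.
    # K:  list of (model, context) situations the agent considers
    #     possible before the intervention; Pr: maps situations to
    #     probabilities; dr: hypothetical responsibility function.
    total = 0.0
    for (M, u) in K:
        M_after = intervene(M, X)      # the situation after setting X
        total += dr(M_after, u, X, phi) * Pr[(M, u)]
    return total
```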
**Acknowledgements:** Joseph Halpern was supported in part by NSF grants IIS-0911036 and CCF-1214844, AFOSR grant FA9550-08-1-0438, ARO grant W911NF-14-1-0017, and by the DoD Multidisciplinary University Research Initiative (MURI) program administered by AFOSR under grant FA9550-12-1-0040.

## References

Beer, I.; Ben-David, S.; Chockler, H.; Orni, A.; and Trefler, R. J. 2012. Explaining counterexamples using causality. Formal Methods in System Design 40(1):20–40.

Chockler, H., and Halpern, J. Y. 2004. Responsibility and blame: a structural-model approach. Journal of Artificial Intelligence Research (JAIR) 22:93–115.

Chockler, H.; Halpern, J. Y.; and Kupferman, O. 2008. What causes a system to satisfy a specification? ACM Transactions on Computational Logic 9(3).

Collins, J.; Hall, N.; and Paul, L. A., eds. 2004. Causation and Counterfactuals. Cambridge, Mass.: MIT Press.

Eiter, T., and Lukasiewicz, T. 2002. Complexity results for structure-based causality. Artificial Intelligence 142(1):53–89.

Gerstenberg, T., and Lagnado, D. 2010. Spreading the blame: the allocation of responsibility amongst multiple agents. Cognition 115:166–171.

Hall, N. 2004. Two concepts of causation. In Collins, J.; Hall, N.; and Paul, L. A., eds., Causation and Counterfactuals. Cambridge, Mass.: MIT Press.

Halpern, J. Y., and Pearl, J. 2001. Causes and explanations: A structural-model approach. Part I: Causes. In Proc. Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI 2001), 194–202.

Halpern, J. Y., and Pearl, J. 2005. Causes and explanations: A structural-model approach. Part I: Causes. British Journal for the Philosophy of Science 56(4):843–887.

Halpern, J. Y. 2008. Defaults and normality in causal structures. In Principles of Knowledge Representation and Reasoning: Proc. Eleventh International Conference (KR '08), 198–208.

Hopkins, M., and Pearl, J. 2003. Clarifying the usage of structural models for commonsense causal reasoning. In Proc. AAAI Spring Symposium on Logical Formalizations of Commonsense Reasoning.

Hopkins, M. 2001. A proof of the conjunctive cause conjecture. Unpublished manuscript.

Hume, D. 1739. A Treatise of Human Nature. London: John Noon.

Jenner, B., and Toran, J. 1995. Computing functions with parallel queries to NP. Theoretical Computer Science 141:175–193.

Johnson, D. S. 1990. A catalog of complexity classes. In van Leeuwen, J., ed., Handbook of Theoretical Computer Science, volume A, chapter 2. Elsevier Science.

Lagnado, D.; Gerstenberg, T.; and Zultan, R. 2013. Causal responsibility and counterfactuals. Cognitive Science 37:1036–1073.

Meyer, A., and Stockmeyer, L. 1972. The equivalence problem for regular expressions with squaring requires exponential time. In Proc. 13th IEEE Symp. on Switching and Automata Theory, 125–129.

Papadimitriou, C. H., and Yannakakis, M. 1984. The complexity of facets (and some facets of complexity). Journal of Computer and System Sciences 28(2):244–259.

Papadimitriou, C. H. 1984. The complexity of unique solutions. Journal of the ACM 31:492–500.

Pearl, J. 2000. Causality: Models, Reasoning, and Inference. New York: Cambridge University Press.

Stockmeyer, L. J. 1977. The polynomial-time hierarchy. Theoretical Computer Science 3:1–22.

Wrathall, C. 1976. Complete sets and the polynomial-time hierarchy. Theoretical Computer Science 3:23–33.