# Nested Counterfactual Identification from Arbitrary Surrogate Experiments

Juan D. Correa (Columbia University, jdcorrea@cs.columbia.edu), Sanghack Lee (Seoul National University, sanghack@snu.ac.kr), Elias Bareinboim (Columbia University, eb@cs.columbia.edu)

The Ladder of Causation describes three qualitatively different types of activities an agent may be interested in engaging in, namely, seeing (observational), doing (interventional), and imagining (counterfactual) (Pearl and Mackenzie, 2018). The inferential challenge imposed by the causal hierarchy is that data is collected by an agent observing or intervening in a system (layers 1 and 2), while its goal may be to understand what would have happened had it taken a different course of action, contrary to what factually ended up happening (layer 3). While there exists a solid understanding of the conditions under which cross-layer inferences are allowed from observations to interventions, the results are somewhat scarcer when targeting counterfactual quantities. In this paper, we study the identification of nested counterfactuals from an arbitrary combination of observations and experiments. Specifically, building on a more explicit definition of nested counterfactuals, we prove the counterfactual unnesting theorem (CUT), which allows one to map arbitrary nested counterfactuals to unnested ones. For instance, applications in mediation and fairness analysis usually evoke notions of direct, indirect, and spurious effects, which naturally require nesting. Second, we introduce a sufficient and necessary graphical condition for counterfactual identification from an arbitrary combination of observational and experimental distributions. Lastly, we develop an efficient and complete algorithm for identifying nested counterfactuals; if the algorithm fails to return an expression for a query, then the query is not identifiable.

1 Introduction

Counterfactuals provide the basis for notions pervasive throughout human affairs, such as credit assignment, blame and responsibility, and regret. One of the most powerful constructs in human reasoning, the "what if?" question, evokes hypothetical conditions usually contradicting the factual evidence. Judgment and understanding of critical situations, from medicine to psychology to business, involve counterfactual reasoning, e.g.: "Joe received the treatment and died; would he be alive had he not received it?", "Had the candidate been male instead of female, would the decision from the admissions committee be more favorable?", or "Would the profit this quarter remain within 5% of its value had we increased the price by 2%?". By and large, counterfactuals are key ingredients that go into the construction of explanations about why things happened as they did [17, 19].

The structural interpretation of causality provides proper semantics for representing counterfactuals [17, Ch. 7]. Specifically, each structural causal model (SCM) M induces a collection of distributions related to the activities of seeing (called observational), doing (interventional), and imagining (counterfactual). The collection of these distributions is known as the Ladder of Causation [19], and has also been called Pearl's Causal Hierarchy (PCH, for short) [2]. The PCH is a containment hierarchy; each type of distribution can be put in increasingly refined layers: observational content goes in layer 1; experimental in layer 2; counterfactual in layer 3; see Fig. 1.
Figure 1: Every SCM induces different quantities in each layer of the PCH: the structural causal model (unobserved nature) induces P(X, Y) (layer 1), P(Y | do(X)) (layer 2), and P(Y_x | x', y') (layer 3).

It is understood that if we have all the information in the world about layer 1, there are still questions about layers 2 and 3 that are unanswerable, or technically undetermined; further, if we have data from layers 1 and 2, there are still questions in the world about layer 3 that are underdetermined [17, 2]. The inferential challenge in these settings arises because the generating model M is not fully observed, nor is data from all of the layers necessarily available, perhaps due to the cost or the infeasibility of performing certain interventions. One common task found in the literature is to determine the effect of an intervention of a variable X on an outcome Y, say P(Y | do(X)) (layer 2), using data from observations P(V) (layer 1), where V is the set of observed variables, and possibly other interventions, e.g., P(V | do(Z)). Also, qualitative assumptions about the system are usually articulated in the form of a causal diagram G. This setting has been studied in the literature under the rubric of non-parametric identification from a combination of observations and experiments. Multiple solutions exist, including Pearl's celebrated do-calculus [16], and other increasingly refined methods that are computationally efficient, sufficient, and necessary [25, 8, 20, 26, 21, 10, 3, 13, 12].¹

There is a growing literature about cross-layer inferences from data in layers 1 and 2 to quantities in layer 3. For example, a data scientist may be interested in evaluating the effect of an intervention on the group of subjects who receive the treatment instead of those randomly assigned to it. This measure is known as the effect of treatment on the treated [9, 17], and there exists a graphical condition for mapping it to a (layer 2) causal effect [23]. Further, there are also results on the identification of path-specific effects, which are counterfactuals that isolate specific paths in the graph [18, 1]. In particular, [24] provides a sufficient and necessary algorithm for identification of these effects from observational data, and [28] provides identification conditions from observational and experimental data in general canonical models. Further, [22] studied counterfactual identification under the assumption that all experimental distributions (i.e., over every subset of the observed variables) are available.²

In this paper, our goal is to identify the probability distribution of (possibly nested) counterfactual events from an arbitrary combination of user-specified observational and experimental distributions. To the best of our knowledge, this provides the first general treatment of nested counterfactual identification from arbitrary data collections. Moreover, it also provides the first graphical and algorithmic, sufficient and necessary conditions for the identification of counterfactuals from observational data alone (when no experimental data is available) and arbitrary causal diagrams. Moving up the PCH, our results allow for arbitrary quantities as inferential targets and for the addition of arbitrary experimental distributions to the input, increasing the flexibility of the solution.

Figure 2: Causal diagram with treatment X, outcome Y, and mediator M.
For concreteness, consider the causal diagram shown in Fig. 2 and a counterfactual query called the direct effect. This quantity represents the sensitivity of a variable Y to changes in another variable X while all other factors in the analysis remain fixed. Suppose X is the level of exercise, M cholesterol levels, and Y cardiovascular disease. Exercising can improve cholesterol levels, which in turn affect the chances of developing cardiovascular disease. An interesting question is how much exercise prevents the disease by means other than regulating cholesterol. In counterfactual notation, this is to compare Y_{x,M_x} and Y_{x',M_x}, where x and x' are different values. The first quantity represents the value of Y when X = x and M varies accordingly. The second expression is the value Y attains if X is held constant at x' while M still follows X = x. The difference E[Y_{x',M_x} − Y_{x,M_x}], known as the natural direct effect (NDE), is non-zero if there is some direct effect of X on Y. In this instance, this nested counterfactual is identifiable only if observational data and experiments on X are available. After all, there is no general identification method for this particular counterfactual family (which also includes indirect and spurious effects) and, more broadly, other arbitrary nested counterfactuals that are well-defined in layer 3. Our goal is to understand the non-parametric identification of arbitrary nested and conditional counterfactuals when the input consists of any combination of observational and interventional distributions, whatever is available to the data scientist. More specifically, our contributions are as follows.

1. We look at nested counterfactuals from an SCM perspective and introduce machinery that supports counterfactual reasoning. In particular, we prove the counterfactual unnesting theorem (CUT), which allows one to map any nested counterfactual to an unnested one (Section 2).
2. Building on this new machinery, we derive sufficient and necessary graphical conditions and an algorithm to determine the identifiability of marginal nested counterfactuals from an arbitrary combination of observational and experimental distributions (Section 3).
3. We prove a reduction from conditional counterfactuals to marginal ones, and use it to derive a complete algorithm for their identification (Section 4).

Due to space constraints, all the proofs in this paper can be found in the full technical report [5].

¹ In fact, this is a classic task in a larger family of problems known as data fusion, which includes other challenges such as selection bias and transportability, to cite a few. For more details, see [4].

² For the sake of context, the work proposed here can be seen as a generalization of two tasks: counterfactual identification under the assumptions discussed earlier [22] and interventional identification from arbitrary experiments [13]. As discussed later on, we will be able to show, based on the machinery developed here, that the individual methods for those tasks can be combined and also be shown complete.

1.1 Preliminaries

We denote variables by capital letters, X, and values by small letters, x. Bold letters, X, represent sets of variables, and x sets of values. The domain of a variable X is denoted by 𝒳_X. Two values x and z are said to be consistent if they share the common values for X ∩ Z. We also denote by x \ Z the value of X \ Z consistent with x, and by x ∩ Z the subset of x corresponding to the variables in Z. We assume the domain of every variable is finite. We rely on causal graphs and denote them with a calligraphic letter, e.g., G.
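Before turning to graph notation, the following minimal Python sketch (our own illustration, not part of the paper) makes the value-set notation above concrete by representing value assignments as dictionaries; the helper names `consistent`, `restrict`, and `drop` are ours and purely expository.

```python
# Minimal sketch (illustrative): value assignments such as x and z are
# represented as dicts mapping variable names to values.

def consistent(x, z):
    """x and z are consistent if they agree on every variable in X ∩ Z."""
    return all(x[v] == z[v] for v in x.keys() & z.keys())

def restrict(x, Z):
    """x ∩ Z: the subset of x corresponding to the variables in Z."""
    return {v: val for v, val in x.items() if v in Z}

def drop(x, Z):
    """x \\ Z: the value of X \\ Z consistent with x."""
    return {v: val for v, val in x.items() if v not in Z}

x = {"X": 1, "Z": 0, "W": 1}
z = {"Z": 0, "Y": 1}
assert consistent(x, z)                               # they agree on Z
assert restrict(x, {"Z", "W"}) == {"Z": 0, "W": 1}    # x ∩ {Z, W}
assert drop(x, {"Z"}) == {"X": 1, "W": 1}             # x \ {Z}
```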
We denote the set of vertices (i.e., variables) in G as V(G). Given a graph G, G_{W̄X̲} is the result of removing edges coming into variables in W and going out from variables in X. G[W] denotes a vertex-induced subgraph including W and the edges among its elements. We use kinship notation for graphical relationships such as parents, children, descendants, and ancestors of a set of variables. For example, the set of parents of X in G is Pa(X)_G := X ∪ ⋃_{X ∈ X} Pa(X)_G. Similarly, we define Ch(·), De(·), and An(·).

To articulate and formalize counterfactual questions, we require a framework that allows us to reason simultaneously about events from alternative worlds. Accordingly, we employ the Structural Causal Model (SCM) paradigm [17, Ch. 7]. An SCM M is a 4-tuple ⟨U, V, F, P(u)⟩, where U is a set of exogenous (latent) variables; V is a set of endogenous (observable) variables; and F is a collection of functions such that each variable V_i ∈ V is determined by a function f_i ∈ F. Each f_i is a mapping from a set of exogenous variables U_i ⊆ U and a set of endogenous variables Pa_i ⊆ V \ {V_i} to the domain of V_i. Uncertainty is encoded through a probability distribution over the exogenous variables, P(U). An SCM M induces a causal diagram G where V is the set of vertices, there is a directed edge (V_j → V_i) for every V_i ∈ V and V_j ∈ Pa_i, and a bidirected edge (V_i ↔ V_j) for every pair V_i, V_j ∈ V such that U_i ∩ U_j ≠ ∅ (V_i and V_j have a common exogenous parent). We assume that the underlying model is recursive, that is, there are no cyclic dependencies among the variables; equivalently, the corresponding causal diagram is acyclic. The set V(G) can be partitioned into subsets called c-components [27] such that two variables belong to the same c-component if they are connected in G by a path made entirely of bidirected edges.

2 SCMs and Nested Counterfactuals

Intervening on a system represented by an SCM M results in a new model differing only in the mechanisms associated with the intervened variables [15, 6, 7]. If the intervention consists of fixing the value of a variable X to a constant x ∈ 𝒳_X, it induces a submodel, denoted M_x [17, Def. 7.1.2]. To formally study nested counterfactuals, we extend this notion to models derived from interventions that replace functions from the original SCM with other, not necessarily constant, functions.

Definition 1 (Derived Model). Let M be an SCM, Û ⊆ U, X ∈ V, and X̂ : 𝒳_Û → 𝒳_X a function. Then M_X̂, called the derived model of M according to X̂, is identical to M, except that the function f_X is replaced with a function f̂_X identical to X̂.

This definition is easily extendable to models derived from an intervention on a set X instead of a singleton. When X̂ is a collection of functions {X̂ : 𝒳_{Û_X} → 𝒳_X}_{X ∈ X}, the derived model M_X̂ is obtained by replacing each f_X with X̂ for X ∈ X. Next, we discuss the concept of potential response [17, Def. 7.4.1] with respect to the derived models.

Definition 2 (Potential Response). Let X, Y ⊆ V be subsets of observable variables, let u be a unit, and let X̂(u) be a set of functions from 𝒳_{Û_X} to 𝒳_X, for X ∈ X, where Û_X ⊆ U. Then Y_{X=X̂}(u) (or Y_X̂(u), for short) is called the potential response of Y to X = X̂, and is defined as the solution of Y, for a particular u, in the derived model M_X̂.

A potential response Y_X̂(u) describes the value that variable Y would attain for a unit (or individual) u if the intervention X̂ is performed.
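To make Definitions 1 and 2 concrete, here is a minimal Python sketch (our own illustration, not code from the paper) for a toy SCM over the chain X → Z → Y; the dictionary-of-lambdas representation and all names (`mechanisms`, `solve`, `derived`) are expository assumptions.

```python
# Minimal illustration (ours, not the paper's code) of Definitions 1 and 2 on a
# toy SCM over the chain X -> Z -> Y with binary variables.  A "unit" u is a
# full assignment to the exogenous variables U = {u_x, u_z, u_y}.

mechanisms = {
    "X": lambda v, u: u["u_x"],            # f_X(u_x)
    "Z": lambda v, u: v["X"] ^ u["u_z"],    # f_Z(X, u_z)
    "Y": lambda v, u: v["Z"] | u["u_y"],    # f_Y(Z, u_y)
}
order = ["X", "Z", "Y"]  # a topological order of the (recursive) model

def solve(mechs, u):
    """Solve the structural equations for unit u (the 'solution' in Definition 2)."""
    v = {}
    for name in order:
        v[name] = mechs[name](v, u)
    return v

def derived(mechs, var, fn):
    """Derived model (Definition 1): replace f_var with a new function fn of u."""
    new = dict(mechs)
    new[var] = lambda v, u: fn(u)
    return new

u = {"u_x": 1, "u_z": 0, "u_y": 0}

# Potential response Y_{X=0}(u): the value of Y in the derived model fixing X to 0.
y_x0 = solve(derived(mechanisms, "X", lambda u: 0), u)["Y"]

# A non-constant intervention: replace f_Z with the potential response Z_{X=0},
# yielding the nested counterfactual Y_{Z_{X=0}}(u) discussed in Section 2.1.
z_x0 = solve(derived(mechanisms, "X", lambda u: 0), u)["Z"]
y_nested = solve(derived(mechanisms, "Z", lambda u_: z_x0), u)["Y"]
print(y_x0, y_nested)
```

Replacing f_Z with a potential response rather than a constant is exactly the kind of natural intervention used to define nested counterfactuals in Section 2.1.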
This concept is tightly related to that of potential outcome, but the former explicitly allows for interventions that do not necessarily fix the variables in X to a constant value. Averaging over the space of U, a potential response Y_X̂(u) induces a random variable that we will denote simply as Y_X̂. If the intervention replaces a function f_X with a potential response of X in M, we say the intervention is natural. When variables are enumerated as W_1, W_2, ..., we may add square brackets around the part of the subscript denoting interventions.

We use W_* to denote sets of arbitrary counterfactual variables. Let W_* = {W_{1[T̂_1]}, W_{2[T̂_2]}, ...} represent a set of counterfactual variables such that W_i ∈ V and T_i ⊆ V for i = 1, ..., l. Define V(W_*) = {W ∈ V | W_T̂ ∈ W_*}, that is, the set of observables that appear in W_*. Let w_* represent a vector of values, one for each variable in W_*, and define w_*(X_*) as the subset of w_* corresponding to X_*, for any X_* ⊆ W_*. The probability of any counterfactual event is given by

P(Y_* = y_*) = Σ_{u : Y_*(u) = y_*} P(u),    (1)

where the predicate Y_*(u) = y_* means ⋀_{Y_X̂ ∈ Y_*} Y_X̂(u) = y. When all variables in the expression have the same subscript, that is, they belong to the same submodel, we will often denote it as P_x(W_1, W_2, ...). For most real-world scenarios, having access to a fully specified SCM of the underlying system is unfeasible. Nevertheless, our analysis does not rely on such privileged access, but on the aspects of the model captured by the causal graph and data samples generated by the unobserved model.

2.1 Nested Counterfactuals

Potential responses can be compounded based on natural interventions. For instance, the counterfactual Y_{Z_x}(u) (i.e., Y_{Z=Z_x}(u)) can be seen as the potential response of Y to an intervention that makes Ẑ equal to Z_x. Notice that Z_x(u) is in itself a potential response, but from a different (nested) model. Hence we call Y_{Z_x} a nested counterfactual. Recall the causal diagram in Fig. 2 and consider once again the NDE as

NDE_{x,x'}(Y) = E[Y_{x',Z_x}] − E[Y_x].    (2)

The second term is also equal to Y_{x,Z_x}, since Z_x is consistent with X = x, so it is the value Y listens to in M_x. Meanwhile, the first one is related to P(Y_{x',Z_x}), the probability of a nested counterfactual.

2.2 Tools for Counterfactual Reasoning

Before characterizing the identification of counterfactuals from observational and experimental data, we develop from first principles a canonical representation of any such query. First, we extend the notion of ancestors to counterfactual variables, which subsumes the usual one described before.

Definition 3 (Ancestors, of a counterfactual). Let Y_x be such that Y ∈ V, X ⊆ V. Then, the set of (counterfactual) ancestors of Y_x, denoted An(Y_x), consists of each W_z such that W ∈ An(Y)_{G_X̲} (which includes Y itself), and z = x ∩ An(W)_{G_X̄}.

For a set of variables W_*, we define An(W_*) as the union of the ancestors of each variable in the set, that is, An(W_*) = ⋃_{W_t ∈ W_*} An(W_t). For instance, in Fig. 3(a), An(Y_x) = {Y_x, Z}, An(X_{y,z}) = {X_z}, and An(Y_z) = {Y_z, X_z} (depicted in Fig. 3(b)). In Fig. 3(c), An(Z, Y_z) = {Y_z, X_z, Z, W} and An(Y_x) = {Y_x} (represented in Fig. 3(d)).

Figure 3: Two causal diagrams and the subgraphs considered when finding sets of ancestors for a counterfactual variable: (a) backdoor graph; (b) graphical representation of the ancestors of Y_z; (c) napkin graph; (d) graphical representation of the ancestors of Y_x.
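As an illustration, the following Python sketch (ours, not the paper's code) implements the ancestor computation of Definition 3 as reconstructed above and reproduces the backdoor-graph examples; the two edge-removal steps mirror G_X̲ and G_X̄ as written in the definition, and the names `ancestors` and `ctf_ancestors` are purely illustrative.

```python
# Illustrative sketch (not the paper's code): counterfactual ancestors per
# Definition 3 on the backdoor graph of Fig. 3(a): Z -> X, Z -> Y, X -> Y.
# Only directed edges are needed for ancestor computations.

parents = {"Z": set(), "X": {"Z"}, "Y": {"Z", "X"}}

def ancestors(v, par):
    """All ancestors of v (including v itself) via directed paths."""
    out, stack = {v}, [v]
    while stack:
        for p in par[stack.pop()]:
            if p not in out:
                out.add(p)
                stack.append(p)
    return out

def ctf_ancestors(y, x_vars, par):
    """An(Y_x) as pairs (W, intervened ancestors that remain in the subscript)."""
    # G with edges *out of* the intervened variables removed (membership check).
    g_under = {v: {p for p in ps if p not in x_vars} for v, ps in par.items()}
    # G with edges *into* the intervened variables removed (subscript computation).
    g_over = {v: (set() if v in x_vars else ps) for v, ps in par.items()}
    return {(w, frozenset(x_vars & ancestors(w, g_over)))
            for w in ancestors(y, g_under)}

print(ctf_ancestors("Y", {"X"}, parents))  # An(Y_x) = {Y_x, Z}
print(ctf_ancestors("Y", {"Z"}, parents))  # An(Y_z) = {Y_z, X_z}, cf. Fig. 3(b)
```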
Probabilistic and causal inference with graphical models exploits the local structure among variables, specifically parent-child relationships, to infer and even estimate probabilities. In particular, Tian [27] introduced c-factors, which have proven instrumental in solving many problems in causal inference. We naturally generalize this notion to the counterfactual setting with the following definition.

Definition 4 (Counterfactual Factor (ctf-factor)). A counterfactual factor is a distribution of the form

P(W_{1[pa_1]} = w_1, W_{2[pa_2]} = w_2, ..., W_{l[pa_l]} = w_l),    (3)

where each W_i ∈ V and there could be W_i = W_j for some i, j ∈ {1, ..., l}.

For example, for Fig. 3(c), P(Y_x = y, Y_{x'} = y') and P(Y_x = y, X_z = x) are ctf-factors, but P(Y_z = y, Z_w = z) is not. Using the notion of ancestrality introduced in Definition 3, we can factorize counterfactual probabilities as ctf-factors.

Theorem 1 (Ancestral set factorization). Let W_* be an ancestral set, that is, An(W_*) = W_*, and let w_* be a vector with a value for each variable in W_*. Then,

P(W_* = w_*) = P(⋀_{W_t ∈ W_*} W_{pa_W} = w),    (4)

where each w is the corresponding w_t, and pa_W is determined for each W_t ∈ W_* as follows: (i) the values for variables in Pa_W ∩ T are the same as in t, and (ii) the values for variables in Pa_W \ T are taken from w_* corresponding to the parents of W.

Proof outline. Following a reverse topological order in G, look at each W_{i[t_i]} ∈ W_*. Since any parent of W_i not in T_i must appear in W_*, the composition axiom [17, 7.3.1] licenses adding them to the subscript. Then, by exclusion restrictions [16], any intervention not involving Pa(W_i) can be removed to obtain the form in Eq. (4).

For example, consider the diagram in Fig. 3(c) and the counterfactual P(Y_x = y | X = x'), known as the effect of the treatment on the treated (ETT) [9, 17]. First note that P(Y_x = y | X = x') = P(Y_x = y, X = x') / P(X = x') and that An(Y_x, X) = {Y_x, X, Z, W}, then

P(Y_x = y, X = x') = Σ_{z,w} P(Y_x = y, X = x', Z = z, W = w).    (5)

Then, by Theorem 1 we can write

P(Y_x = y, X = x') = Σ_{z,w} P(Y_x = y, X_z = x', Z_w = z, W = w).    (6)

Moreover, the following result describes a factorization of ctf-factors based on the c-component structure of the graph, which will prove instrumental in the next section.

Theorem 2 (Counterfactual factorization). Let P(W_* = w_*) be a ctf-factor, let W_1 < W_2 < ... be a topological order over the variables in G[V(W_*)], and let C_1, ..., C_k be the c-components of the same graph. Define C_j* = {W_{pa_W} ∈ W_* | W ∈ C_j} and c_j* as the values in w_* corresponding to C_j*; then P(W_* = w_*) decomposes as

P(W_* = w_*) = Π_j P(C_j* = c_j*).    (7)

Figure 4: Three causal diagrams representing plausible structures in mediation analysis.

Furthermore, each factor can be computed from P(W_* = w_*) as P(C_j* = c_j*) = Y {w|Wpaw W ,Wi