# Probabilistic Sufficient Explanations

Eric Wang, Pasha Khosravi and Guy Van den Broeck
University of California, Los Angeles
ericzxwang@ucla.edu, {pashak,guyvdb}@cs.ucla.edu

## Abstract

Understanding the behavior of learned classifiers is an important task, and various black-box explanations, logical reasoning approaches, and model-specific methods have been proposed. In this paper, we introduce probabilistic sufficient explanations, which formulate explaining an instance of classification as choosing the simplest subset of features such that only observing those features is sufficient to explain the classification. That is, sufficient to give us strong probabilistic guarantees that the model will behave similarly when all features are observed under the data distribution. In addition, we leverage tractable probabilistic reasoning tools such as probabilistic circuits and expected predictions to design a scalable algorithm for finding the desired explanations while keeping the guarantees intact. Our experiments demonstrate the effectiveness of our algorithm in finding sufficient explanations, and showcase its advantages compared to Anchors and logical explanations.

## 1 Introduction

Machine learning models are becoming ubiquitous, and are being used in critical and sensitive areas such as medicine, loan applications, and risk assessment in courts. Hence, unexpected and faulty behaviors in machine learning models can have a significant negative impact on people. As a result, there is much focus on explaining and understanding the behavior of such models. Explainable AI (or XAI) is an active area of research that aims to tackle these issues.

There have been many approaches toward explaining an instance of a classification (called a local explanation) from different perspectives, including logic-based [Shih et al., 2018; Ignatiev et al., 2019a; Darwiche and Hirth, 2020] or model-agnostic approaches [Ribeiro et al., 2016; Lundberg and Lee, 2017; Ribeiro et al., 2018]. Each of these methods has its pros and cons; some focus on scalability and flexibility, and some focus on providing guarantees.

In this work, we strive to probabilistically explain an instance of classification. Explanations are partial examples, where we treat the features not in the explanation as missing values. We aim for our explanations to be as simple as possible while providing the following sufficiency guarantee: given only the features in the explanation, with high probability under the data distribution Pr(X), the classifier makes the same prediction as on the full example. Simplicity and sufficiency are often at odds with each other, hence balancing them is a delicate act.

We briefly overview other approaches to local explanations and discuss their pros and cons in the framework of sufficiency and simplicity. Broadly, model-agnostic methods are more scalable and flexible but tend not to be as reliable as logic-based methods in providing sufficiency. On the other hand, logical explanation methods tend to sacrifice simplicity for deterministic guarantees of sufficiency. We then motivate how probabilistic notions of sufficiency can overcome these shortcomings as the foundation for sufficient explanations. Next, we introduce the probabilistic reasoning tools needed to quantify sufficiency and define sufficient explanations: the Same-Decision Probability (SDP) and Expected Prediction (EP).
We use probabilistic circuits (PCs) [Choi et al., 2020] to model the probability distribution Pr(X) over the features due to their expressivity and tractability in answering complex probabilistic queries. We also explore connections between SDP and expected prediction and show why expected prediction is better suited for providing the probabilistic guarantees for sufficient explanations.

We then formalize the desired properties of sufficient explanations and motivate our choices for sufficiency and simplicity. Sufficiency leads to maximizing the expected prediction, while simplicity leads to constraining the size of explanations as well as choosing the subset that also maximizes the marginal probability. We capture all of these with an optimization problem for finding sufficient explanations. Then, we design a scalable algorithm for finding the most likely sufficient explanation by leveraging the tractability of expected prediction.

We provide experiments showcasing the empirical advantage of sufficient explanations and the effectiveness of our search algorithm. Our advantages include: (i) compared to Anchors, we get better and more accurate sufficiency guarantees, and (ii) our method finds simpler explanations and is more scalable than logical methods. Lastly, we show the tradeoffs between sufficiency and simplicity, and show that slightly reducing the guarantee can lead to simpler explanations with higher likelihood.

## 2 Background and Related Work

**Notation.** We use uppercase letters (X) for features (random variables) and lowercase letters (x) for their value assignments. Analogously, we denote sets of features in bold uppercase ($\mathbf{X}$) and their assignments in bold lowercase ($\mathbf{x}$). We denote the set of all possible assignments to $\mathbf{X}$ as $\mathcal{X}$. Concatenation $\mathbf{X}\mathbf{Y}$ denotes the union of disjoint sets. We focus on discrete features unless otherwise noted. We represent a probabilistic predictor as $f : \mathcal{X} \to [0, 1]$ and its decision function as $C : \mathcal{X} \to \{0, 1\}$, with $T$ denoting the decision threshold. Hence, $C(\mathbf{x}) = [\![\, f(\mathbf{x}) \ge T \,]\!]$. Sometimes we want to deal directly with log-odds instead of probabilities, in which case we use the log-odds predictor $O : \mathcal{X} \to \mathbb{R}$, defined as $O(\mathbf{x}) = \log \frac{f(\mathbf{x})}{1 - f(\mathbf{x})}$.

### 2.1 Related Work

Computing explanations of classifiers has been studied from many different perspectives, including logical reasoning, black-box methods, and model-specific approaches. Some try to explain the learned model globally, making it more interpretable [Guidotti et al., 2018; Liang and Van den Broeck, 2019], while others focus more locally on explaining its prediction for a single instance. Next, we go over some local explanation methods, discuss their pros and cons, and motivate how our framework might solve those issues.

**Model-Agnostic Approaches.** These methods treat the classifier as a black box. Given an input instance to explain, they perturb the instance in many different ways and evaluate the model on those perturbed instances. Then, they use the results from the perturbations to generate an explanation. Two popular methods under this umbrella are LIME [Ribeiro et al., 2016] and SHAP [Lundberg and Lee, 2017]. The main difference between these methods lies in the heuristics used to obtain perturbed instances and in how the predictions on these local perturbations are analyzed. Most provide feature attributions, which are real-valued numbers assigned to each feature, to indicate their importance to the decision and in what direction.
A benefit of these methods is that they can be used to explain any model and are generally more flexible and scalable than their alternatives. On the other hand, the downsides are that they can be very sensitive to the choice of local perturbations and might produce over-confident results [Ignatiev et al., 2019b] or be fooled by adversarial methods [Slack et al., 2020; Dimanov et al., 2020]. One of the main reasons for these downsides is that the distribution of the local perturbations tends to differ from the data distribution the classifier was originally trained on. Hence, these approaches do not benefit from the intended generalization guarantees of machine learning models. Moreover, some of the perturbations might be low-probability or even impossible inputs, and we might not care as much about their classification outcome. Additionally, feature attribution methods treat each feature independently and cannot easily capture interactions between bigger subsets of features, such as when two features cancel each other's effects. We refer to Camburu et al. [2020] for more such examples and a discussion of the pros and cons of attribution-based explanations.

**Logical Reasoning Approaches.** These methods provide explanations with some principled guarantees by leveraging logical reasoning tools. Some approaches use knowledge compilation and tractable Boolean circuits [Shih et al., 2018; Darwiche and Hirth, 2020; Shi et al., 2020], some adopt the framework of abductive reasoning [Ignatiev et al., 2019a; Ignatiev et al., 2019b], and some tackle a specific family of models such as linear models [Marques-Silva et al., 2020], decision trees [Izza et al., 2020], or tree ensembles [Devos et al., 2020]. The main benefit of these approaches is that they guarantee provably correct explanations; that is, they guarantee a certain prediction for all examples described by the explanation. On the other hand, one downside is that they are generally not as scalable (in the number of features) as black-box methods. Another downside is that they need to completely remove the uncertainty from the classifier in order to use logical tools, and therefore become more rigid. In particular, in order to guarantee a certain outcome with absolute certainty, it is often necessary to include many of the features in the explanation, making the explanation more complex.

Sufficient Reasons [Shih et al., 2018; Darwiche and Hirth, 2020] is one example of these methods; it selects as an explanation a minimal subset of features guaranteeing that, no matter what is observed for the remaining features, the decision will stay the same. Sufficient reasons, as well as related logical explanations, ensure minimality and deterministic guarantees on the outcome, whereas, as we will see later, our sufficient explanations ensure probabilistic guarantees instead. For a recent and more comprehensive comparison of logic-based vs. model-agnostic explanation methods, we refer to Ignatiev et al. [2019b] and Ignatiev [2020].

## 3 Motivation and Problem Statement

We will overcome the limitations of both the model-agnostic and logic-based approaches by building local explanation methods that are aware of the distribution over features Pr(X). We assume for now that we have access to an accurate model of the feature distribution, and discuss in Section 4.1 how it can be obtained. This distribution will allow us to (i) reason about the classifier's behavior on realistic input instances and (ii) provide probabilistic guarantees on the veracity of the explanations.
We thus take a principled probabilistic approach to explaining an instance of classification. Intuitively, given an instance x and the classifier's outcome C(x), we would like to choose a subset of features $\mathbf{z} \subseteq \mathbf{x}$ as the simplest sufficient explanation. Firstly, we want it to be sufficient, in that it provides strong probabilistic guarantees about the outcome of the classifier when only the features z are observed. To this end, we want to maximize some sufficiency metric F(z, x, f, Pr) describing the behavior of our predictor when only the explanation z is observed. However, naively maximizing this metric can run into the pitfall of building more complex explanations in order to squeeze out tiny improvements in the sufficiency metric. To address this, we have our second metric of simplicity, S(z, Pr), which we use to define a constraint when maximizing the sufficiency metric. Putting these together, we call a subset z of an instance x a sufficient explanation if it maximizes some sufficiency metric F under some simplicity constraint.

With this current definition of sufficient explanations, there is still the possibility that two subsets of a given instance, one a subset of the other, are both simple and have the same degree of sufficiency. This motivates us to further define a minimal sufficient explanation z, i.e., one such that no subset of z is also a sufficient explanation. This condition is exactly the meaning of "prime" in the prime implicants used by logical explanations. We will see in Section 5 how we can use the feature distribution to ensure this minimality property.

## 4 Probabilistic Notions of Sufficiency

We next introduce two notions of sufficiency using probabilistic reasoning tools and discuss their pros and cons. Then, we discuss their connections. Finally, we use these tools in Section 5 to formalize our definition and formulate finding the desired subsets of features as an optimization problem.

### 4.1 Probabilistic Reasoning Tools

Probabilistic reasoning is a hard task in general, so we need to choose our probabilistic model for Pr(X) carefully. We choose probabilistic circuits (PCs), which, given some structural constraints, enable tractable and exact computation of probabilistic reasoning queries such as marginals [Choi et al., 2020]. Moreover, they do so without giving up much expressivity. Another advantage of PCs is that we can learn their structure and parameters from data, allowing us to avoid the exponential worst-case behavior of other probabilistic models.

The two main probabilistic reasoning tools that we use for our explanations are the Same-Decision Probability (SDP) [Chen et al., 2012] and the Expected Prediction (EP) [Khosravi et al., 2019b]. We introduce them next and explore their tradeoffs and connections.

First we have the SDP [Chen et al., 2012] which, intuitively, gives us the probability that our classifier has the same output as C(x) given only some subset of observed features z. (The SDP was originally defined for a classifier that is a conditional probability test in the distribution Pr; here, we slightly generalize SDP to apply to a distribution Pr with a separate classifier C.)

**Definition 1 (Same-Decision Probability).** Given a classifier C, a distribution Pr(X) over features, a partition $\mathbf{Z}\mathbf{M}$ of the features $\mathbf{X}$, and an assignment $\mathbf{x}$ to $\mathbf{X}$ (with corresponding $\mathbf{z} \subseteq \mathbf{x}$), the same-decision probability (SDP) of z w.r.t. x is

$$\mathrm{SDP}_{C,\mathbf{x}}(\mathbf{z}) = \mathbb{E}_{\mathbf{m} \sim \Pr(\mathbf{M} \mid \mathbf{z})} \big[\!\big[\, C(\mathbf{z}\mathbf{m}) = C(\mathbf{x}) \,\big]\!\big].$$

The higher the SDP, the better the guarantee we have that the partial example z will be classified the same way as the full example x.
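As a concrete illustration of Definition 1, the following is a minimal Monte Carlo sketch for estimating the SDP, in the spirit of the sampling-based approximation used later in Section 6.1. The names `classify` and `sample_missing_given_z` are assumed callbacks (a decision function C and a conditional sampler for Pr(M | z), e.g. backed by a probabilistic circuit); this is an illustrative sketch, not the paper's implementation.

```python
import numpy as np

def estimate_sdp(classify, sample_missing_given_z, x, z_idx, n_samples=10_000, seed=0):
    """Monte Carlo estimate of SDP_{C,x}(z) from Definition 1.

    classify:                maps a complete feature vector to a {0, 1} decision (the function C).
    sample_missing_given_z:  draws one joint assignment for the missing features M from Pr(M | z),
                             returned in increasing index order (e.g. by conditional sampling
                             from a probabilistic circuit).
    x:                       the full instance being explained.
    z_idx:                   indices of the observed subset z.
    """
    rng = np.random.default_rng(seed)
    m_idx = [i for i in range(len(x)) if i not in set(z_idx)]   # the missing features M
    target = classify(np.asarray(x))          # C(x), the decision on the full example
    agree = 0
    for _ in range(n_samples):
        completion = np.array(x, copy=True)   # keep z fixed, overwrite the missing part
        completion[m_idx] = sample_missing_given_z(x, z_idx, rng)   # m ~ Pr(M | z)
        agree += int(classify(completion) == target)                # [[ C(zm) = C(x) ]]
    return agree / n_samples
```

With the 10,000 conditional samples used in Section 6.1, the standard error of such an estimate is at most 0.005.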
SDP and related notions have been successfully used in applications such as trimming Bayesian network classifiers [Choi et al., 2017] and robust feature selection [Choi and Van den Broeck, 2018]. Renooij [2018] introduced various theoretical properties and bounds on the SDP.

Other explanation methods that provide sufficiency guarantees can be fit into our framework, using SDP as the sufficiency metric. Notably, Anchors [Ribeiro et al., 2018] can be thought of as an empirical approximation of SDP sufficiency, as they aim to provide sufficiency guarantees based on sampling local perturbations instead of relying on the data distribution. Logical explanations also fit in this framework, as they completely prioritize having deterministic guarantees, i.e. SDP = 1, over the need for simplicity. Another example of a method that uses probabilistic sufficiency can be found in Khosravi et al. [2019b], where logistic regression models are explained w.r.t. naive Bayes data distributions.

Expected prediction is another probabilistic reasoning task that has been shown to be successful in handling missing values in classification [Khosravi et al., 2019a; Khosravi et al., 2019b; Khosravi et al., 2020]. It provides a promising alternative to SDP for explanations. Intuitively, given some partial observation, the expected prediction can be thought of as trying all possible ways of imputing the remaining features and computing an average of all subsequent predictions, weighted by the probability of each imputation. In many cases, our classifiers directly output a probability, and in those cases we can compute the expected prediction as follows:

**Definition 2 (Expected Prediction).** Given a probabilistic predictor f, a distribution Pr(X) over features, a partition $\mathbf{Z}\mathbf{M}$ of the features $\mathbf{X}$, and an assignment $\mathbf{z}$ to $\mathbf{Z}$, the expected prediction of f on z is

$$\mathrm{EP}_f(\mathbf{z}) = \mathbb{E}_{\mathbf{m} \sim \Pr(\mathbf{M} \mid \mathbf{z})} \, f(\mathbf{z}\mathbf{m}).$$

For some tasks, we may care more about odds than probabilities. In those cases, the predictor usually outputs the log-odds O(x), so we also define the expected log-odds:

**Definition 3 (Expected Log-Odds).** Given a log-odds predictor O, a distribution Pr(X) over features, a partition $\mathbf{Z}\mathbf{M}$ of the features $\mathbf{X}$, and an assignment $\mathbf{z}$ to $\mathbf{Z}$, the expected log-odds of O on z is

$$\mathrm{EP}_O(\mathbf{z}) = \mathbb{E}_{\mathbf{m} \sim \Pr(\mathbf{M} \mid \mathbf{z})} \, O(\mathbf{z}\mathbf{m}).$$

In this paper, unless otherwise noted, the expected prediction denoted EP(z) may refer to either case.

### 4.2 Connections between SDP and EP

The choice between using SDP and expected prediction as the sufficiency metric is important, as it decides what kind of guarantees our explanations provide and how efficient they are to compute. Here we explore connections between SDP and expected prediction and provide intuition on some advantages of using expected prediction in defining sufficient explanations.

One of the main differences between SDP and expected prediction is their computational complexity. While the SDP is an appealing criterion to use for selecting explanations, computing it exactly is computationally hard. In particular, it is $\mathrm{PP}^{\mathrm{PP}}$-hard on Bayesian networks [Choi et al., 2012]. Even on a simple naive Bayes model for both Pr and the classifier, computing the SDP is NP-hard [Chen et al., 2013]. On the other hand, expected predictions can be tractably computed for many different pairs of discriminative and generative models.
For example, it is known to be tractable for the following cases: (i) logistic regression using a conformant naive Bayes distribution [Khosravi et al., 2019b], (ii) decision trees w.r.t. probabilistic circuits (PCs) [Khosravi et al., 2020], (iii) discriminative circuits w.r.t. PCs [Khosravi et al., 2019a], and (iv) when both the feature distribution and the predictor are defined by the same PC distribution Pr. In the latter case, the predictor is the conditional probability Pr(c | X), and the feature distribution is Pr(X). Then, expected prediction reduces to probabilistic marginal inference in PCs, which is tractable for decomposable circuits [Choi et al., 2020].

Another difference comes from how SDP and expected prediction handle the two distinct uncertainties that arise from the feature distribution and the classifier. Although both are aware of the uncertainty from the feature distribution Pr, SDP ignores the uncertainty in the classifier since it only deals with the decision function C(x). For example, SDP cannot distinguish between cases when the classifier has low confidence, $f(\mathbf{x}) = T + \epsilon$ (only slightly above the decision threshold), and cases when the classifier has high confidence, such as f(x) = 0.99. In many domains this distinction is vital. For example, doctors care more about the odds of a patient having cancer than about binary decisions.

For these reasons, we choose to use expected predictions to define and optimize for our explanations. The good news is that optimizing EP also allows us to maximize a lower bound on SDP. The next theorem provides a theoretical lower bound for SDP using expected predictions.

**Theorem 1.** Given a predictor f (or O), its thresholded classifier C, a positively classified instance x (i.e., C(x) = 1), a distribution Pr, and some subset of the features $\mathbf{z} \subseteq \mathbf{x}$, we have:

$$\mathrm{SDP}_{C,\mathbf{x}}(\mathbf{z}) \;\ge\; \frac{\mathrm{EP}(\mathbf{z}) - T}{U(\mathbf{z}) - T}, \tag{1}$$

where EP(z) refers to the expected prediction, and U(z) is an upper bound for the predictor after fixing z, i.e., $\forall \mathbf{m}:\; U(\mathbf{z}) \ge f(\mathbf{z}\mathbf{m})$. Moreover, if U(z) is a tight bound, then Equation 1 is also tight.

*Proof.* The proof is included in the appendix, available at http://starai.cs.ucla.edu/papers/WangIJCAI21.pdf.

Theorem 1 gives us two ways of computing a lower bound for SDP: using a probabilistic predictor f or a log-odds predictor O. Additionally, the theorem can easily be generalized to the case C(x) = 0. See the appendix for a more detailed discussion of which bound is better.

## 5 Probabilistic Sufficient Explanations

In this section, we use the aforementioned sufficiency metrics and introduce some desirable constraints to ensure simplicity. Then, we put everything together to formalize sufficient explanations and introduce an optimization problem for finding them. Finally, we introduce a search algorithm for finding sufficient explanations by modifying beam search and leveraging the tractability of expected predictions. To simplify the definitions, we assume without loss of generality that the instance we want to explain is positively classified, i.e., C(x) = 1.

In Section 4.2, we saw two candidates that we can use for probabilistic sufficiency guarantees. The first was SDP, which focuses on the final decision of the model. The second was expected prediction, which takes the confidence of the model into account. Because of this, along with its computational tractability, we choose to use expected prediction for our sufficiency guarantees.
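To make the quantities in Theorem 1 concrete, here is a small brute-force sketch for a discrete feature space. The callbacks `predict_proba` (the predictor f) and `pr_m_given_z` (the conditional probability Pr(m | z), e.g. a marginal query on a PC) are assumptions for illustration; in the paper, expected predictions are computed exactly and tractably on the circuit pairs listed above rather than by enumeration.

```python
from itertools import product

def brute_force_ep(predict_proba, pr_m_given_z, x, z_idx, domains):
    """Brute-force EP(z) = E_{m ~ Pr(M|z)} f(zm) over a small discrete feature space.

    predict_proba: the probabilistic predictor f, mapping a complete assignment to [0, 1].
    pr_m_given_z:  Pr(m | z), the conditional probability of one completion of the
                   missing features (given their indices and values).
    domains:       list of admissible values for each feature.
    """
    m_idx = [i for i in range(len(x)) if i not in set(z_idx)]
    ep = 0.0
    for m_vals in product(*(domains[i] for i in m_idx)):   # enumerate every imputation
        completion = list(x)
        for i, v in zip(m_idx, m_vals):
            completion[i] = v
        ep += pr_m_given_z(m_idx, m_vals) * predict_proba(completion)
    return ep

def sdp_lower_bound(ep, threshold, upper):
    """Theorem 1 for a positively classified instance: SDP >= (EP(z) - T) / (U(z) - T),
    where `upper` is U(z), an upper bound on f over all completions of z."""
    return (ep - threshold) / (upper - threshold)
```

For instance, with threshold T = 0.5 and U(z) = 1, an expected prediction of 0.9 already certifies an SDP of at least 0.8 by Equation 1.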
More specifically, we want to maximize the expected prediction of our explanation to ensure our model is confident in its classification. In addition, maximizing the expected prediction also maximizes a lower bound for SDP, so in practice we still get good SDP guarantees.

Having chosen a suitable sufficiency metric, we must now choose a simplicity constraint. There are multiple candidates to choose from here as well. For example, we can impose a cardinality constraint or require explanations to have a high enough likelihood. We choose the cardinality constraint, as it is easy to decide on a threshold for the explanation size. We can now formalize the notion of sufficient explanations.

**Definition 4 (Sufficient Explanations).** Given a predictor f, a distribution Pr, and a positively classified instance x (i.e., C(x) = 1), the set of sufficient explanations for x, which we denote $\mathrm{SE}_k(\mathbf{x})$, is defined as the solution of the following optimization problem:

$$\operatorname*{argmax}_{\mathbf{z} \subseteq \mathbf{x}} \; \mathrm{EP}(\mathbf{z}) \quad \text{s.t.} \quad |\mathbf{z}| \le k.$$

Having defined sufficient explanations, and following the same line of thought as before, we are now interested in choosing sufficient explanations which are minimal. For logical explanations, any minimal sufficient subset can be chosen. However, we can be more selective in our choices using the feature distribution. A natural choice is to choose the most likely sufficient explanations, as these are the most realistic. By maximizing the marginal probability of the explanation, we also ensure the desired minimality. This is because, for subsets $\mathbf{z}_1$ and $\mathbf{z}_2$ of an instance x, if $\mathbf{z}_1 \subseteq \mathbf{z}_2$ then $\Pr(\mathbf{z}_1) \ge \Pr(\mathbf{z}_2)$. We thus arrive at the following definition.

**Definition 5 (Most Likely Sufficient Explanations).** Given a predictor f, a distribution Pr, and an instance x, the most likely sufficient explanations for x are given by:

$$\mathrm{MLSE}_k(\mathbf{x}) = \operatorname*{argmax}_{\mathbf{z} \in \mathrm{SE}_k(\mathbf{x})} \Pr(\mathbf{z}).$$

Next, we devise a search algorithm for finding sufficient explanations. In Section 6, we show that our algorithm works well in practice, ensuring both sufficiency and simplicity.

**Finding Probabilistic Sufficient Explanations.** To find the most likely sufficient explanations, we use a beam search algorithm to greedily search the space of potential explanations. We do so by keeping track of the top b candidates (the beam) for explanations of each cardinality, based on their expected predictions and marginal probabilities. In more detail, we begin at level zero with the empty candidate, i.e., no features selected. For each subsequent level, we expand the top b candidates of the previous level by considering all feature subsets with one more feature than in the previous level, formed by adding a previously unselected feature to each candidate. We then select the new top b candidates by ranking the expanded states based on EP, breaking ties using Pr, and keeping the top b. The search stops after level k. At the same time, we keep track of the most likely candidate attaining the current maximum EP.

Due to the nature of the search, each level of the beam search is highly parallelizable, which can be leveraged to speed up the search. Additionally, by running the algorithm for k levels, we can also keep track of the best sufficiency guarantee at each level between 1 and k. Hence, with a little extra bookkeeping we can keep track of explanations with different sizes, and thus degrees of simplicity, and their corresponding sufficiency guarantees.
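The following is a compact sketch of the beam search just described. Here `expected_prediction` and `marginal_prob` are assumed callbacks that score a set of selected feature indices (in the paper both quantities are computed exactly from the probabilistic circuit and the classifier); the parallelization and per-size bookkeeping mentioned above are omitted for brevity.

```python
def beam_search_mlse(n_features, k, beam_size, expected_prediction, marginal_prob):
    """Greedy beam search for a most likely sufficient explanation (Definition 5).

    expected_prediction(z) and marginal_prob(z) score a frozenset of selected feature
    indices; both are assumed callbacks here. Returns the selected indices together
    with their expected prediction and marginal probability.
    """
    beam = [frozenset()]                                   # level 0: the empty candidate
    best = (float("-inf"), float("-inf"), frozenset())     # (EP, Pr, explanation)
    for _ in range(k):
        # expand each beam candidate by one not-yet-selected feature
        candidates = {z | {i} for z in beam for i in range(n_features) if i not in z}
        if not candidates:                                 # k exceeds the number of features
            break
        scored = [(expected_prediction(z), marginal_prob(z), z) for z in candidates]
        # rank by expected prediction, breaking ties by marginal probability
        scored.sort(key=lambda t: (t[0], t[1]), reverse=True)
        beam = [z for _, _, z in scored[:beam_size]]
        # remember the most likely candidate attaining the best EP seen so far
        if (scored[0][0], scored[0][1]) > (best[0], best[1]):
            best = scored[0]
    ep, prob, z = best
    return sorted(z), ep, prob
```

In the MNIST experiments of Section 6.1, this loop corresponds to k = 30 levels with a beam of size 5.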
Note that the search framework can also easily be adapted to different use cases. For example, we can provide a fixed sufficiency constraint and then find the simplest and most likely explanation with an expected prediction higher than the sufficiency threshold. However, choosing a good threshold for the expected prediction is not always straightforward.

## 6 Experiments

In this section, we provide several experiments to showcase the effectiveness of our search algorithm in finding sufficient explanations (code at github.com/UCLA-StarAI/SufficientExplanations). Additionally, we aim to highlight the advantages of our method in comparison with Anchors and logical explanation methods. More specifically, we would like to answer the following questions:

- Can we find explanations with good sufficiency guarantees? How do they compare with Anchors?
- Does relaxing from logical to probabilistic guarantees lead to much simpler explanations?
- What are the tradeoffs between different sufficiency levels and explanation complexity?

We use the adult and MNIST datasets [Kohavi, 1996; Yann et al., 2009] for our experiments. For each dataset, we model the feature distribution by learning a probabilistic circuit (PC) using the open-source Juice library [Dang et al., 2021]. We choose decision forests learned by XGBoost [Chen and Guestrin, 2016] as our classifier, as they are popular and because expected prediction is tractable for forests w.r.t. PCs [Khosravi et al., 2020]. For more detailed information on the datasets, preprocessing steps, learned models, and computing infrastructure, please refer to the appendix.

### 6.1 Comparison with Anchors

To demonstrate the scalability of our method and showcase some of its advantages in comparison with Anchors, we ran our algorithm on the binarized MNIST dataset with a binary classification task of distinguishing between 3s and 5s. Some images are shown in Figure 1, along with a comparison between explanations found using our method and Anchors. For Anchors, we used an SDP (precision) threshold of 0.95, a tolerance (δ) of 0.05, and a beam size of 5. On average, it took Anchors 454s to generate explanations. Our algorithm with the same beam size and cardinality constraint k = 30 took 347s using 16 threads; the sufficient explanations with the same size as Anchors took 45s. See the appendix for more details on the run-times.

*Figure 1: Explanations for selected MNIST images, for (a) correctly classified examples and (b) misclassified examples. From left to right: 1) original image; 2) Anchors explanation; 3) our explanation with the same number of features; 4) our explanation with k = 30. Gray pixels were not chosen for the explanation; pixels chosen for the explanation are colored the same as in the original image.*

| Method  | \|EP_O(z)\|   | SDP_{C,x}(z) | log P(z)      |
|---------|---------------|--------------|---------------|
| Anchors | 0.75 ± 0.37   | 0.66 ± 0.08  | −3.29 ± 0.88  |
| MLSE_s  | 1.57 ± 0.29   | 0.86 ± 0.05  | −3.05 ± 0.65  |
| MLSE_10 | 3.11 ± 0.23   | 0.99 ± 0.01  | −6.98 ± 1.37  |
| MLSE_20 | 3.60 ± 0.15   | 1.00 ± 0.00  | −9.90 ± 2.14  |
| MLSE_30 | 3.75 ± 0.13   | 1.00 ± 0.00  | −11.77 ± 2.88 |

*Table 1: Comparison of average expected log-odds, SDP, and marginals between Anchors and MLSE, averaged over 50 random MNIST test images. We take the absolute value of EP_O(z) to measure the confidence of the explanations (since it can be negative). MLSE_s sets the cardinality constraint to the size of the Anchors explanation for each image. The ± denotes one standard deviation. The SDP values are approximated.*

The last image of each row in Figure 1 is the explanation found using our algorithm with cardinality constraint k = 30.
We see that these explanations were able to pick up on certain features we would naturally use to distinguish between 3s and 5s. In particular, the chosen pixels were mostly in the upper portion of each image. This makes sense, as both 3s and 5s have a similar arch shape in their lower portions, so the upper portion is more useful for distinguishing between the two. Additionally, the explanations contain not only some white pixels showing an outline of the predicted number, but also some black pixels where a number of the opposite label may be present. Finally, by looking only at the explanation in the rightmost column of the last two rows, we can guess that the classifier will misclassify those examples.

Table 1 provides data on the expected log-odds, SDP, and marginal probabilities of explanations generated using Anchors and our method in a few different scenarios. Since SDP is intractable to compute exactly, we estimate it by computing the SDP on 10,000 samples drawn from the probabilistic circuit conditioned on the explanation. One advantage of using PCs as our generative model is that drawing conditional samples from Pr(· | z) is very fast; for example, generating 10K samples takes 1 second. We see that the Anchors explanations are quite overconfident, with much lower sample SDPs than the desired 0.95. This trend was also observed by Ignatiev et al. [2019b].

We plot the sample SDPs for our explanations, along with lower bounds calculated using Theorem 1, in Figure 2. The green line shows that, even when optimizing the EP of our explanations, the SDP still tends to be very high. Moreover, the blue line shows that the simple and efficient SDP lower bound can also provide this guarantee for some of the larger explanations with high EP. The red line is yet another way to estimate the SDP lower bound. See the appendix for more details.

*Figure 2: For sufficient explanations (sizes 1-30), we plot SDP estimates (green) vs. SDP lower bounds calculated based on expected log-odds (blue), and lower bounds based on approximate expected prediction (red), averaged over 50 test images of MNIST. Shaded regions represent one standard deviation.*

### 6.2 Comparison with Logical Explanations

We also compared explanations found using our method to logical explanations, i.e., minimal explanations with SDP = 1. We used the adult dataset, which has far fewer features, for this task in order for the logical explanation computation to be feasible. We removed some features to allow for brute-force computation of minimum-cardinality logical explanations. We chose to use brute force because, as far as we know, there is no method for finding logical explanations for our use case. In total we removed 3 features, leaving us with 11 features.

From our brute-force search on some test examples, we found that logical explanations needed on average 39% of the features in order to reach an SDP of one. By using our algorithm of maximizing the expected prediction, we found that in most cases selecting only 18% of the features was already enough to guarantee an SDP of 0.95 on average. More detailed numbers are provided in Table 2. We see that the strict requirement imposed by logical explanation methods can lead to selecting more features and hence to more complex explanations.
We expect this gap between the complexities of logical and probabilistic explanations to widen for datasets with more features or more complex models.

| Method  | SDP_{C,x}(z) | log P(z)     | size        |
|---------|--------------|--------------|-------------|
| Logical | 1.0          | −7.12 ± 2.11 | 4.30 ± 1.13 |
| Anchors | 0.98 ± 0.02  | −4.27 ± 2.61 | 2.02 ± 1.26 |
| MLSE_s  | 0.97 ± 0.03  | −3.85 ± 2.37 | 2.17 ± 1.18 |
| MLSE_1  | 0.88 ± 0.19  | −2.23 ± 1.34 | 1.0 ± 0.0   |
| MLSE_2  | 0.95 ± 0.08  | −3.88 ± 1.88 | 2.0 ± 0.0   |
| MLSE_3  | 0.98 ± 0.05  | −4.77 ± 2.31 | 2.99 ± 0.06 |
| MLSE_4  | 0.99 ± 0.03  | −5.63 ± 2.59 | 3.96 ± 0.22 |

*Table 2: SDP, marginal probability, and size statistics for explanations found using different methods on the adult dataset.*

### 6.3 Tradeoffs between Sufficiency and Simplicity

Finally, we examine the tradeoff between maximizing sufficiency and explanation simplicity. We compare the expected predictions with the marginal likelihoods of explanations. Figure 3 shows a scatter plot comparing these two quantities for different explanations generated for 50 MNIST images. As we can see, the general trend is that by enforcing less strict sufficiency requirements we can get more likely (and also smaller) explanations. In particular, the trend is very steep around EPs of 0 and 1, meaning that making the guarantees even slightly probabilistic leads to a significant simplification of the explanations, thus validating our probabilistic approach.

*Figure 3: Tradeoff between expected prediction and marginal probability for MLSEs. The first plot is for positive-label images (5s); the second is for negative-label images (3s). Expected predictions were computed using a first-order approximation.*

## 7 Conclusion

We introduced a new framework for reasoning about the local behavior of classifiers. We formulated the problem as finding the simplest and most likely explanations that maximize a probabilistic guarantee, and discussed the advantages of our framework compared to model-agnostic and logical explanation methods. We provided experiments to validate our claims. We conclude that probabilistic sufficient explanations are a valuable addition to the arsenal of local explanation methods.

## Acknowledgments

The authors would like to thank YooJung Choi for helpful discussions regarding SDP. This work is partially supported by NSF grants #IIS-1943641, #IIS-1633857, #CCF-1837129, DARPA grant #N66001-17-2-4032, a Sloan Fellowship, Intel, and Facebook.

## References

[Camburu et al., 2020] Oana-Maria Camburu, Eleonora Giunchiglia, Jakob Foerster, Thomas Lukasiewicz, and Phil Blunsom. The struggles of feature-based explanations: Shapley values vs. minimal sufficient subsets. arXiv, 2020.

[Chen and Guestrin, 2016] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In KDD, 2016.

[Chen et al., 2012] Suming Chen, Arthur Choi, and Adnan Darwiche. The same-decision probability: A new tool for decision making. In PGM, 2012.

[Chen et al., 2013] Suming Jeremiah Chen, Arthur Choi, and Adnan Darwiche. An exact algorithm for computing the same-decision probability. In IJCAI, 2013.

[Choi and Van den Broeck, 2018] YooJung Choi and Guy Van den Broeck. On robust trimming of Bayesian network classifiers. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI), July 2018.

[Choi et al., 2012] Arthur Choi, Yexiang Xue, and Adnan Darwiche. Same-decision probability: A confidence measure for threshold-based decisions. International Journal of Approximate Reasoning, 53(9):1415–1428, 2012.

[Choi et al., 2017] YooJung Choi, Adnan Darwiche, and Guy Van den Broeck. Optimal feature selection for decision robustness in Bayesian networks. In Proceedings of IJCAI, August 2017.
[Choi et al., 2020] YooJung Choi, Antonio Vergari, and Guy Van den Broeck. Probabilistic circuits: A unifying framework for tractable probabilistic models. 2020.

[Dang et al., 2021] Meihua Dang, Pasha Khosravi, Yitao Liang, Antonio Vergari, and Guy Van den Broeck. Juice: A Julia package for logic and probabilistic circuits. In AAAI Conference (Demo Track), 2021.

[Darwiche and Hirth, 2020] Adnan Darwiche and Auguste Hirth. On the reasons behind decisions. In European Conference on Artificial Intelligence (ECAI), 2020.

[Devos et al., 2020] Laurens Devos, Wannes Meert, and Jesse Davis. Additive tree ensembles: Reasoning about potential instances. arXiv, 2020.

[Dimanov et al., 2020] Botty Dimanov, Umang Bhatt, Mateja Jamnik, and Adrian Weller. You shouldn't trust me: Learning models which conceal unfairness from multiple explanation methods. In ECAI, 2020.

[Guidotti et al., 2018] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. A survey of methods for explaining black box models. ACM Comput. Surv., 2018.

[Ignatiev et al., 2019a] Alexey Ignatiev, Nina Narodytska, and Joao Marques-Silva. Abduction-based explanations for machine learning models. Proceedings of the AAAI Conference, 2019.

[Ignatiev et al., 2019b] Alexey Ignatiev, Nina Narodytska, and Joao Marques-Silva. On validating, repairing and refining heuristic ML explanations. arXiv, 2019.

[Ignatiev, 2020] Alexey Ignatiev. Towards Trustable Explainable AI. In IJCAI, July 2020.

[Izza et al., 2020] Yacine Izza, Alexey Ignatiev, and Joao Marques-Silva. On explaining decision trees. arXiv, 2020.

[Khosravi et al., 2019a] Pasha Khosravi, YooJung Choi, Yitao Liang, Antonio Vergari, and Guy Van den Broeck. On tractable computation of expected predictions. In Advances in Neural Information Processing Systems, 2019.

[Khosravi et al., 2019b] Pasha Khosravi, Yitao Liang, YooJung Choi, and Guy Van den Broeck. What to expect of classifiers? Reasoning about logistic regression with missing features. In Proceedings of IJCAI, August 2019.

[Khosravi et al., 2020] Pasha Khosravi, Antonio Vergari, YooJung Choi, Yitao Liang, and Guy Van den Broeck. Handling missing data in decision trees: A probabilistic approach. In The Artemiss Workshop at ICML, 2020.

[Kohavi, 1996] Ron Kohavi. Scaling up the accuracy of naive-Bayes classifiers: A decision-tree hybrid. In KDD, 1996.

[Liang and Van den Broeck, 2019] Yitao Liang and Guy Van den Broeck. Learning logistic circuits. In Proceedings of the 33rd AAAI Conference, 2019.

[Lundberg and Lee, 2017] Scott M. Lundberg and Su-In Lee. A unified approach to interpreting model predictions. NeurIPS, 2017.

[Marques-Silva et al., 2020] Joao Marques-Silva, Thomas Gerspacher, Martin C. Cooper, Alexey Ignatiev, and Nina Narodytska. Explaining naive Bayes and other linear classifiers with polynomial time and delay. In NeurIPS, 2020.

[Renooij, 2018] Silja Renooij. Same-decision probability: Threshold robustness and application to explanation. In Proceedings of the Ninth International Conference on Probabilistic Graphical Models (PGM). PMLR, 2018.

[Ribeiro et al., 2016] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "Why should I trust you?": Explaining the predictions of any classifier. In KDD, August 2016.

[Ribeiro et al., 2018] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. Anchors: High-precision model-agnostic explanations. In Proceedings of AAAI, 2018.
[Shi et al., 2020] Weijia Shi, Andy Shih, Adnan Darwiche, and Arthur Choi. On Tractable Representations of Binary Neural Networks. In Proceedings of the 17th KR, 2020.

[Shih et al., 2018] Andy Shih, Arthur Choi, and Adnan Darwiche. A symbolic approach to explaining Bayesian network classifiers. In Proceedings of IJCAI, 2018.

[Slack et al., 2020] Dylan Slack, Sophie Hilgard, Emily Jia, Sameer Singh, and Himabindu Lakkaraju. Fooling LIME and SHAP: Adversarial attacks on post hoc explanation methods. In AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2020.

[Yann et al., 2009] LeCun Yann, Cortes Corinna, and Christopher JC Burges. The MNIST database of handwritten digits, 2009.