# Model Agnostic Multilevel Explanations

Karthikeyan Natesan Ramamurthy, Bhanukiran Vinzamuri, Yunfeng Zhang, Amit Dhurandhar
IBM Research, Yorktown Heights, NY USA 10598
knatesa@us.ibm.com, bhanu.vinzamuri@ibm.com, {zhangyun, adhuran}@us.ibm.com

Abstract

In recent years, post-hoc local instance-level and global dataset-level explainability of black-box models has received a lot of attention. Less attention has been given to obtaining insights at intermediate or group levels, a need outlined in recent works that study the challenges in realizing the guidelines of the General Data Protection Regulation (GDPR). In this paper, we propose a meta-method that, given a typical local explainability method, can build a multilevel explanation tree. The leaves of this tree correspond to local explanations, the root corresponds to the global explanation, and intermediate levels correspond to explanations for groups of data points that the method automatically clusters. The method can also leverage side information, where users can specify points for which they may want the explanations to be similar. We argue that such a multilevel structure can also be an effective form of communication, where one can obtain a few explanations that characterize the entire dataset by considering an appropriate level in our explanation tree. Explanations for novel test points can be cost-efficiently obtained by associating them with the closest training points. When the local explainability technique is generalized additive (viz. LIME, GAMs), we develop a fast approximate algorithm for building the multilevel tree and study its convergence behavior. We show that we produce high fidelity sparse explanations on several public datasets and also validate the effectiveness of the proposed technique based on two human studies, one with experts and the other with non-expert users, on real world datasets.

1 Introduction

A very natural and effective way to communicate is to first provide high level general concepts and only then dive into more of the specifics [1]. In addition, the transition from high level concepts to more and more specific explanations should ideally be as logical or smooth as possible [2, 3]. For example, when you call a service provider there is usually an automated message trying to categorize the problem at a high level, followed by more specific questions. Eventually, if the issue is not resolved, a human representative may intervene to delve into further details. In such cases, information or explanations provided at multiple levels enable others to obtain insights that are otherwise opaque. Recent work [4] has stressed the importance of having such multilevel explanations to successfully meet the requirements of Europe's General Data Protection Regulation (GDPR) [5]. They argue that simply having local or global explanations may not suffice for providing satisfactory explanations in many cases. In fact, even in the widely participated FICO explainability challenge [6], participants were expected to provide not just local explanations but also insights at the intermediate class level. Motivated by this need, in this paper we propose a novel model agnostic multilevel explanation (MAME) method that takes as input a post-hoc local explainability technique for black-box models (e.g., LIME [7]) and an unlabeled dataset. The method then generates multiple explanations for each of the examples corresponding to different degrees of cohesion (i.e.,
parameter tying) between explanations of the examples. This explicitly controllable degree of cohesion determines a level in our multilevel explanation tree. In addition, we constrain the predictions of the explanation model to be close to those of the black-box at each tree node, ensuring fidelity. At the extremes, the leaves would correspond to independent local explanations as with methods like LIME, while the root would correspond to a single global explanation given the high degree of cohesion.

Figure 1: Illustration of multilevel explanations generated by MAME for an industrial pump failure dataset consisting of 2500 wells. We show three levels: the bottom level has four leaves which correspond to example local explanations, the top level corresponds to one global explanation, and an intermediate level corresponds to explanations for two groups highlighted by MAME. Based on expert feedback, these intermediate explanations, although explaining the same type of pump, had semantic meaning as they corresponded to different manufacturer groups that behave noticeably differently.

An illustration of this is given in Figure 1, where multilevel explanations were generated by MAME for a real industrial pump failure dataset (see Section 4.4 for details). We show three levels: the four leaves correspond to example local explanations (amongst many), the root corresponds to one global explanation, and an intermediate level corresponds to explanations for two groups highlighted by MAME. Note that levels are numbered from 1 (leaves) increasing up to the highest value (root). The dotted lines indicate that the nodes are descendants of the node above, but not direct children. Based on expert feedback, these intermediate explanations correspond to pumps from different manufacturers, resulting in noticeably different behaviors. Also note that each level provides distinct enough information not subsumed by just local or global explanations, thus motivating the need for such multilevel explanations. Such explanations can thus help identify key characteristics that bind together different examples at various levels of granularity. They can also provide exemplar based explanations based on the groupings at specific levels. These are provided in the supplement. Our method can also take into account side information, such as similarity in explanations based on class labels, or user specified groupings based on domain knowledge for a subset of examples. Moreover, one can also use non-linear additive models going beyond LIME to generate local explanations. Our method thus provides a lot of flexibility in building multilevel explanations that can be customized apropos a specific application. We prove that our method actually forms a tree, in that examples merged at a particular level of the tree remain together at higher levels. The proposed fast approximate algorithm for obtaining multilevel explanations is proved to converge to the exact solution. We show that we produce high fidelity sparse explanations on several public datasets. We also validate the effectiveness of the proposed technique based on two human studies, one with experts and the other with non-expert users, on real world datasets.
2 Related Work

The most traditional direction in explainable AI is to directly build interpretable models such as rule lists or decision sets [8, 9] on the original data itself, so that no post-hoc interpretability methods are required to uncover the logic behind the proposed actions. These methods, however, may not readily give the best performance in many applications compared to more sophisticated black-box models. Some works provide local [7, 10, 11, 12] as well as global explanations that are both feature and exemplar based [13]. The method proposed by Plumb et al. [13] provides only local explanations and detects global patterns, but does not automatically identify and provide explanations for groups of data. Exemplar based explanations [14, 15] identify a few examples that are representative of a larger dataset. Previous works tend to use distinct models to provide the local and global explanations; hence, consistency between these models could potentially be an issue. Pedreschi et al. [16] propose a high-level approach to cluster local rule-based explanations to learn a local-to-global tree structure. The authors do not explicitly provide any algorithm, and the drawback here is that the explanations at levels other than local may not have high fidelity to the black-box. In TreeExplainer [17], the authors present five new methods that combine efficiently computed exact local SHAP explanations, which additionally account for local feature interactions, to understand the global behavior of the model. Drawbacks are similar to [16]. Tsang et al. [18] propose a hierarchical method to study how the behavior of interactions for a local explanation changes from the instance level to across the whole dataset. A key difference is that they do not fuse explanations as they go higher up the tree, as we do. Moreover, their notion of hierarchy is based on learning higher order interactions between the prediction and input features, which is different from ours. Bhatt et al. [19] propose a method to aggregate explanation functions based on various desiderata such as complexity, faithfulness, etc. Lakkaraju et al. [20] present a model agnostic approach to customize local explanations to an end-user defined feature set of interest. In contrast, MAME allows the end user to specify a priori which explanations should be similar, and also controls for complexity (sparsity) to learn high fidelity explanations. Our approach has relations to convex clustering [21, 22] and its generalizations to multi-task learning [23, 24]. However, our goal is completely different (multilevel post-hoc explainability), and our methodology of computing and using local models that mimic black-box predictions is also different.

3 Methodology

Let $\mathcal{X} \times \mathcal{Y}$ denote the input-output space and $f : \mathcal{X} \to \mathcal{Y}$ be a classification/regression function corresponding to a black-box model. Let $g(\cdot)$ be a potentially non-linear map on a feature vector $x \in \mathcal{X}$. Let $p$ denote the number of features and $n$ the number of instances. For a parameter vector $\theta \in \mathbb{R}^p$, $l(x, \theta) = g(x)^T \theta$ is a generalized additive model, $g(\cdot)$ being a pre-specified map. We can learn this model from the predictions of $f(\cdot)$ for examples near $x$; this provides a local explanation for $x$ given by $\theta$. The similarity $\psi(x, z)$ between $x$ and $z$ can be estimated as $\exp(-d(x, z)/\eta)$, $d(\cdot, \cdot)$ being a distance function and $\eta$ the bandwidth. Let $\{(x_1, y_1), \ldots, (x_n, y_n)\}$ be a dataset of size $n$, where $y_i$ may or may not be known for each $i$. For each $x_i$, we define the neighborhood $N_i = \{z \in \mathcal{X} \mid \psi(x_i, z) \geq \kappa\}$, with $\kappa$ close to 1. In practice, $N_i$ of size $m$ can be generated by randomly perturbing the instance $x_i$ $m$ times, as done in previous works [7].
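To make this neighborhood-based local surrogate concrete, the following is a minimal sketch (an illustration, not the authors' implementation): it perturbs $x_i$ to form $N_i$, weights the perturbations with $\psi(x_i, z) = \exp(-d(x_i, z)/\eta)$, and fits a sparse linear surrogate with $g(\cdot)$ taken as the identity map. The perturbation scale, bandwidth, and sparsity weight are illustrative placeholders.

```python
# Minimal sketch (assumed, not the paper's code) of one local explanation theta_i.
import numpy as np
from sklearn.linear_model import Lasso

def local_explanation(x_i, f, m=500, eta=1.0, alpha=0.01, scale=0.1, seed=0):
    """x_i: (p,) instance; f: black-box mapping an (m, p) array to (m,) predictions."""
    rng = np.random.default_rng(seed)
    p = x_i.shape[0]

    # Neighborhood N_i: m random perturbations of x_i (Gaussian noise here).
    Z = x_i + scale * rng.standard_normal((m, p))

    # Similarity weights psi(x_i, z) = exp(-d(x_i, z) / eta), with Euclidean d.
    w = np.exp(-np.linalg.norm(Z - x_i, axis=1) / eta)

    # Fit the sparse surrogate l(z, theta) = g(z)^T theta (g = identity) to the
    # black-box predictions, weighting each perturbation by its similarity.
    surrogate = Lasso(alpha=alpha)
    surrogate.fit(Z, f(Z), sample_weight=w)
    return surrogate.coef_  # theta_i: the local explanation for x_i
```

For instance, `local_explanation(x, lambda Z: model.predict(Z))` would produce a leaf-level explanation of the kind that serves as input to the multilevel construction below.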
We now define the optimization:

$$\min_{\theta_1,\ldots,\theta_n}\ \sum_{i=1}^{n}\Big[\sum_{z \in N_i}\psi(x_i,z)\,\big(f(z)-g(z)^{T}\theta_i\big)^{2} \;+\; \alpha_i\,\|\theta_i\|_{1}\Big] \;+\; \beta\sum_{(i,j)\in E} w_{ij}\,\|\theta_i-\theta_j\|_{2} \qquad (1)$$

where $E$ is the edge list of the prior knowledge graph with adjacency matrix $W = [w_{ij}]$, $\alpha_i$ controls the sparsity of each local explanation, and $\beta$ controls the degree of cohesion (parameter tying) between explanations. To build the tree, we solve (1) over an increasing sequence of $\beta$ values using a splitting scheme with auxiliary variables $U$, $V$, $Z_1$, and $Z_2$, whose updates are referred to below as (4)-(8). Rather than solving the problem to convergence for every $\beta$, we adopt an algorithmic regularization (AR) scheme: a surrogate regularization parameter $\gamma$ is initialized to a small value $\epsilon$, a single pass of the updates is performed, and $\gamma$ is then multiplied by a step-size $t$ (> 1.0) for the next $k$. We will denote these approximate solutions as $\Theta^{(k)}$, where $k$ corresponds to the index of the set of $\beta$ values.

The detailed algorithm for obtaining the multilevel tree using MAME is described in Algorithm 1. This procedure results in a single tree because each Union operation merges two child nodes (and the sub-trees for which they are roots) to create a parent node; merges only happen upward, and hence there are no cycles. In this algorithm, when the stopping criterion on $V$ is met, a single global model for the entire dataset is obtained (represented by the root of the tree). Step (vi) runs LASSO on the group of examples in each node to get post-processed representative explanations. Since the algorithm implements the AR approach, $\gamma^{(k)}$ is used as a surrogate notation for $\beta$ (which is used in the exact solution).

Algorithm 1: Model Agnostic Multilevel Explanation (MAME) method

Input: Dataset $x_1, \ldots, x_n$, black-box model $f(\cdot)$, the coordinate-wise map $g(\cdot)$, and prior knowledge graph adjacency matrix $W$.

i) Sample neighborhoods $N_i$ for each example $x_i$.
ii) Construct matrix $D$ based on edge list $E$.
iii) Initialize $\Theta^{(0)}, U^{(0)}, V^{(0)}$ and set $k = 0$, multiplicative step-size $t$ (say 1.01), $\gamma^{(0)} = \epsilon$ (say 1e-10), grouping threshold $\tau$ (say 1e-6), $\rho$ (say 2), and tolerance tol (say 1e-6).
iv) Initialize a disjoint set with the leaves of the multilevel tree, $S = \{1, \ldots, n\}$.
while $\|V^{(k+1)}\| >$ tol do
    Obtain $\Theta^{(k+1)}$ by solving (4), $U^{(k+1)}$ by solving (5), and $V^{(k+1)}$ by solving (6) with $\beta$ set to $\gamma^{(k)}$.
    Obtain $Z_1^{(k+1)}, Z_2^{(k+1)}$ by solving (7) and (8).
    For each edge $e_l = (i, j) \in E$, if $\|v_l\| < \tau$, perform Union($i, j$).
    $k = k + 1$; $\gamma^{(k+1)} = \gamma^{(k)} \cdot t$.
end while
v) Recover the multilevel tree by keeping track of the disjoint set unions.
vi) For every group of examples in each tree node, post-process to get representative explanations by optimizing (1) with $\beta = 0$ only for the examples in the group.

The dominant complexity per pass of the while loop in Algorithm 1 is $O(pn(p + n))$. A detailed computational complexity analysis is provided in the supplement.
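The following is a minimal sketch of the tree-recovery bookkeeping in Algorithm 1, with the per-level solves of (4)-(8) abstracted behind a user-supplied `solve_level` callable; the schedule constants and the stopping rule used here (all explanations merged into one group) are simplifying assumptions rather than the exact criteria above.

```python
# Sketch (assumed) of Algorithm 1's disjoint-set bookkeeping; the actual
# updates (4)-(8) are abstracted behind `solve_level`.
import numpy as np

def find(parent, i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path compression
        i = parent[i]
    return i

def build_multilevel_tree(Theta, edges, solve_level,
                          eps=1e-10, t=1.01, tau=1e-6, max_levels=2000):
    """Theta: (n, p) local explanations (the leaves).
    edges: list of (i, j) pairs from the prior knowledge graph E.
    solve_level(Theta, gamma): one pass of the fused updates at strength gamma.
    Returns a list of merges [(level k, i, j), ...] that defines the tree."""
    n = Theta.shape[0]
    parent = list(range(n))                 # disjoint set over the n leaves
    merges, gamma = [], eps
    for k in range(max_levels):
        Theta = solve_level(Theta, gamma)   # stand-in for updates (4)-(8)
        for i, j in edges:
            ri, rj = find(parent, i), find(parent, j)
            # Union(i, j) once the fused explanations (numerically) coincide.
            if ri != rj and np.linalg.norm(Theta[i] - Theta[j]) < tau:
                parent[rj] = ri
                merges.append((k, i, j))
        if len({find(parent, i) for i in range(n)}) == 1:
            break                           # root reached: one global group
        gamma *= t                          # gamma^(k+1) = gamma^(k) * t
    return merges
```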
The exact solution, where (4)-(8) are run until convergence for each $\beta$ value, will be denoted as $\Theta(\beta)$. We show that the approximate solution converges to the exact one (proof in the supplement). This theorem holds true for $\{\Theta^{(k)}\}$ without the post-processing step (vi) in Algorithm 1.

Theorem 3.1. As $(t, \epsilon) \to (1, 0)$, where $t$ is the multiplicative step-size update and $\epsilon$ is the initial regularization level, the sequence of AR-based primal solutions $\{\Theta^{(k)}\}$ and the sequence of exact primal solutions $\{\Theta(\beta)\}$ converge in the following sense:

$$\max\Big\{\,\sup_{\beta}\,\inf_{k}\,\big\|\Theta^{(k)}-\Theta(\beta)\big\|,\ \ \sup_{k}\,\inf_{\beta}\,\big\|\Theta^{(k)}-\Theta(\beta)\big\|\,\Big\}\ \xrightarrow{\ (t,\epsilon)\to(1,0)\ }\ 0. \qquad (9)$$

Comparisons of approximation quality and timing between the two solutions are in the supplement. We now show that our method actually forms a tree, in that explanations of examples that are close together at lower levels will remain at least equally close at higher levels (proof in the supplement).

Lemma 3.2 (Non-expansive map for exact solutions). If $\beta_1, \ldots, \beta_r$ are the regularization parameters for the last term in (1) for $r$ consecutive levels in our multilevel explanations, where $\beta_1 = 0$ is the lowest level, and $\theta_{i,s}$ and $\theta_{j,s}$ denote the (globally) optimal coefficient vectors (or explanations) for $x_i$ and $x_j$ respectively corresponding to level $s \in \{1, \ldots, r\}$, then for $s > 1$ and $w_{ij} > 0$ we have $\|\theta_{i,s} - \theta_{j,s}\|_2 \leq \|\theta_{i,s-1} - \theta_{j,s-1}\|_2$.

4 Experiments

We evaluate our method on three different cases with two baselines (described below): (a) Two Step, and (b) Submodular Pick LIME (SP-LIME) [7]. We set $g(\cdot)$ to be the identity map for numerical features and a one-hot encoding for categorical features in all the experiments for ease of demonstration, though it can be set to any appropriate non-linear map by the user. We first show the quantitative benefit of our method in terms of two measures defined in Section 4.2 on several public datasets. The MAME approach has better explanation fidelity, and its explanations are better matched to the important features of the black-box, when compared to the baselines. Secondly, we conducted a study with data scientist users who were not experts in finance, using a public loan approval dataset. We found that our method provided significantly better insights to these non-experts compared to the baselines. Data scientists are an appropriate audience for our method, as a recent study [26] claims that data scientists are the first consumers of explanations from AI-based systems in most organizations. Finally, we performed a case study involving human experts in the Oil & Gas industry. Insights provided by MAME were semantically meaningful to the experts both when they did and did not provide additional side information.

4.1 Baselines and Miscellaneous Details

Two Step: This is a hierarchical convex clustering [21, 22] of the local LIME explanations for the $n$ instances, given by $\Omega \in \mathbb{R}^{p \times n}$. The objective is written as $\min_{\Theta}\ \|\Omega - \Theta\|_F^2 + \beta \sum_{i<j} w_{ij}\,\|\theta_i - \theta_j\|_2$.
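As a rough illustration of the Two Step idea (and only a stand-in: the baseline above uses hierarchical convex clustering, whereas this sketch substitutes ordinary agglomerative clustering), one can cluster the independently learned local explanations and report within-group averages as intermediate-level explanations. The function and parameter names below are assumptions for illustration.

```python
# Illustrative stand-in (not the paper's Two Step implementation): cluster the
# rows of Omega (n local explanations) and average within clusters.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def grouped_explanations(Omega, n_groups):
    """Omega: (n, p) array whose rows are local explanations theta_i."""
    Z = linkage(Omega, method="ward")                  # hierarchy over explanations
    labels = fcluster(Z, t=n_groups, criterion="maxclust")
    reps = np.vstack([Omega[labels == g].mean(axis=0) for g in np.unique(labels)])
    return labels, reps                                # group labels, group-level explanations
```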