# Compile!

Pierre Marquis
CRIL-CNRS, Université d'Artois, France
marquis@cril.univ-artois.fr

This paper is concerned with knowledge compilation (KC), a family of approaches developed in AI for more than twenty years. Knowledge compilation consists in pre-processing some pieces of the available information in order to improve the computational efficiency (especially, the time complexity) of some tasks. In this paper, the focus is laid on three KC topics which gave rise to many works: the development of knowledge compilation techniques for the clausal entailment problem in propositional logic, the concept of compilability, and the notion of knowledge compilation map. The three topics, as well as an overview of the main results from the literature, are presented. Some recent research lines are also discussed.

## Introduction

Pioneered more than two decades ago, knowledge compilation (KC) (see (Cadoli and Donini 1998) for a survey) appears nowadays as a very active field in AI. In KC one is basically concerned with the computation of a recursive function f, which represents the task to be solved. f takes two arguments: a fixed part F and a varying part V. The objective is to improve the time needed to compute f(F, V), the image of (F, V) under f, when V varies. To reach this goal, the fixed part F is pre-processed during an off-line phase: F is turned into a compiled form C(F) using a compilation function C. In some situations, such a pre-processing is computationally useful: for some p, the (cumulated) time needed to compute C(F), f(C(F), V1), ..., f(C(F), Vp) can be lower (sometimes by orders of magnitude) than the (cumulated, on-line) time needed to compute f(F, V1), ..., f(F, Vp). It must be noted that the assumption that some pieces of information are not very often subject to change (a fixed part exists) is valid in many scenarios.
For instance, in a computer-aided product configuration problem (which will be considered as a running example), the objective is to help the customer to find a product (e.g., a car) by making some elementary choices (e.g., the number of seats, the engine, the color of the car). What makes this task not obvious is that, on the one hand, not all the combinations of elementary choices correspond to a feasible product (due to production or marketing constraints), and on the other hand, the product catalog is so large that it cannot be represented extensionally. Here, F may represent intensionally the feasible products and each Vi represents some user choices; F is independent of the Vi. Among the tasks of interest, a basic one consists in determining whether the current choices Vi of the user are compatible with F, i.e., whether there exists a product which satisfies those choices (if the answer is negative, then the user must be made aware of it and has to question some of her choices).

In KC, knowledge must be taken in a rather broad sense: F may represent any piece of information. In AI, F typically corresponds to beliefs or to preferences, but pieces of information of other types could be considered as well. Obviously, beliefs and preferences can be modeled in many different ways; most of the existing works focus on propositional statements (in classical logic) and utility functions, respectively. Unsurprisingly, due to the very nature of the input, the tasks to be achieved typically amount to combinations of elementary inference or optimization processes. For instance, in a configuration problem, one needs to be able to provide the user with the direct consequences of the choices she makes and to maintain the minimal and maximal costs of the products compatible with the current choices.

Copyright © 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
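As a toy illustration of this compatibility test, here is a minimal sketch (the constraints, variable numbering, and helper names are invented for the example; a real configurator would rely on far better algorithms than brute-force search over assignments):

```python
from itertools import product

# Toy configuration constraints F as CNF clauses over DIMACS-style
# literals (-v means "not v"); variables 1..3 stand for elementary choices
# such as sunroof, sport engine, small battery (all invented).
F = [(-1, -3),   # sunroof excludes the small battery
     (2, 3)]     # sport engine or small battery is mandatory

def satisfies(assignment, cnf):
    """assignment maps var -> bool; a CNF holds iff every clause has a true literal."""
    return all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in cnf)

def choices_feasible(cnf, choices, variables):
    """Do the user's partial choices (a dict var -> bool) extend to a
    full feasible product?  Brute force over the free variables."""
    free = [v for v in variables if v not in choices]
    for bits in product([False, True], repeat=len(free)):
        assignment = {**choices, **dict(zip(free, bits))}
        if satisfies(assignment, cnf):
            return True
    return False

print(choices_feasible(F, {1: True}, [1, 2, 3]))            # True
print(choices_feasible(F, {1: True, 2: False}, [1, 2, 3]))  # False
```

Here a sunroof alone is fine (it forces a large battery, hence a sport engine), while a sunroof without the sport engine cannot be completed into a feasible product.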
In such applications, where guaranteed response times are expected, knowledge compilation appears as very challenging. Indeed, offering some guarantee on the response time is a first-class requirement for many programs which must make on-line decisions under strong time constraints. Especially, this is the case for a number of Web-based applications (the user is typically not ready to wait a long time for getting the information she is looking for). A standard scenario for which some KC approaches have been developed is the one where F is a database (or a knowledge base), each Vi is a query, and f(F, Vi) is the answer to Vi given F. More specifically, one is often interested in the case when F is a (classical) propositional formula, Vi is a propositional clause (or a propositional CNF formula), and f corresponds to the clausal entailment problem CE, i.e., f is the characteristic function of the language {(F, Vi) | F |= Vi}. This problem occurs in many AI applications. Indeed, since F ∧ ¬Vi is consistent iff F ⊭ Vi, determining whether the user choices (given as a term ¬Vi) are consistent with F reduces to the (complement of the) clausal entailment problem. In particular, such a task is of the utmost value for product configuration: the models of F represent the feasible products, and each time the user makes some choices one must be able to determine efficiently whether such choices can be extended to a feasible product. Because the clausal entailment problem is coNP-complete in propositional logic when F is a CNF formula, no polynomial-time algorithm for solving it (i.e., for computing f) is available in this case, and no such algorithm exists unless P = NP. As a consequence, in the general case there is no polynomial bound on the time required to compute f(F, V), whatever the algorithm used.
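The reduction between consistency and clausal entailment can be sketched in a few lines, with clauses encoded as tuples of DIMACS-style signed integers (the helper names are made up for this example, and the entailment test is naive brute force):

```python
from itertools import product

def has_model(cnf, variables):
    """Return True iff the CNF (an iterable of clauses, each clause a tuple
    of signed-integer literals) has at least one model."""
    for bits in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, bits))
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in cnf):
            return True
    return False

def entails_clause(cnf, clause, variables):
    """F |= V iff F conjoined with the negation of V is inconsistent;
    the negation of the clause l1 v ... v lk is the term ~l1 & ... & ~lk,
    i.e. k additional unit clauses."""
    return not has_model(list(cnf) + [(-l,) for l in clause], variables)

F = [(1, 2), (-1, 2)]                       # F is equivalent to x2
print(entails_clause(F, (2,), [1, 2]))      # True
print(entails_clause(F, (1,), [1, 2]))      # False
```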
Especially, while modern SAT solvers prove efficient in many cases, taking into account the partial assignment corresponding to V may reduce drastically the set of solutions, so that F ∧ V has very few models and those models are hard to find. Thus, no guaranteed response times can be ensured in the general case when f(F, V) is computed as sat(F ∧ V). A fundamental question is the assessment of a KC method: when does a KC approach C achieve its goal? This is not an easy question. Indeed, the set of all possible Vi which will be considered during the on-line phase is typically unknown and often too large for being taken into account exhaustively. Hence one needs to select a subset of significant Vi and check whether the cumulated time needed to compute C(F) and f(C(F), Vi) for each significant Vi is lower than the cumulated time needed to compute f(F, Vi) for each significant Vi. Several difficulties must be overcome: how to select the Vi used for assessing the method? Which algorithm should be considered as a baseline for computing f(F, Vi)?

Instead of looking at the assessment problem at the instance level (i.e., for some given Vi), one can also consider it at the problem level (i.e., whatever V). This leads to the concept of compilability, see e.g., (Cadoli et al. 2002). Under this view, provided that f is a pseudo-Boolean function (corresponding to a decision problem encoded as usual as the formal language of its positive instances), compiling F using C is considered as valuable if the problem which consists in deciding whether f(C(F), V) = 1 for any admissible V is computationally easier than the problem which consists in deciding whether f(F, V) = 1 for any admissible V. C is required to be a polynomial-size function (i.e., for some fixed polynomial p, the size of C(F) is bounded by p(|F|)).
"Computationally easier" typically means that the decision problem when F is compiled using C belongs to a complexity class C located lower in the polynomial hierarchy than the class of the same problem when F has not been compiled. Of special interest is the case when C = P, i.e., the problem becomes tractable provided that the fixed part of the input is C(F). In such a case, the problem is said to be compilable to P. It must be noted that the polynomial-size condition on C is a mandatory requirement for ensuring an unbiased evaluation. Nevertheless, evaluating a KC approach C at the problem level also has its cons: it can prove too coarse-grained to give an accurate view of the improvements offered by KC for a given application. Indeed, an application corresponds to a specific instance of a problem (or to a finite set of such instances), but not to the problem itself (where the focus is laid on the worst case and arbitrarily large instances are considered). Thus, it can be the case that KC proves useful in practice for solving some instances of interest of a problem, even if it does not make the problem itself less intractable or if it does not ensure that the size of C(F) is polynomial in the size of F.

Another important issue in knowledge compilation is the choice of the language L into which the fixed part F must be compiled, i.e., the co-domain of the compilation function C. Often the domain-dependent task f to be achieved is the combination of a number of elementary subtasks, each of them modeled as a specific function fi. Each function corresponds either to a query (i.e., an operation which does not modify the input) or to a transformation (i.e., an operation which returns another L-representation).
The choice of a target language L (which is reminiscent of a well-known issue in Computer Science: the choice of a data structure for a given abstract data type, made precise by the set of functions fi) relies on several criteria, including the expressiveness of L (i.e., its ability to represent the available information) and, more precisely, its spatial efficiency (i.e., its ability to do so using little space), as well as its time efficiency (i.e., the ability to compute the functions fi efficiently when the input is an L-representation). A knowledge compilation map (Darwiche and Marquis 2002) is a systematic, multi-criteria comparative analysis of languages which are candidates for the representation of C(F). A first map for propositional formulae / Boolean functions has been developed and enriched since then.

The rest of the paper is organized as follows. First, I review some of the pioneering works on knowledge compilation, where the problem of interest is clausal entailment in propositional logic. Given its importance (as explained before), there is still some active research on the topic, and some recent results based on the notion of empowering clause are sketched. Then the concept of compilability is formally defined and a couple of properties presented. The next section is about the notion of knowledge compilation map. The concepts of queries and transformations, as well as those of expressiveness, spatial efficiency and time efficiency w.r.t. some queries and/or transformations, are recalled. Some examples pertaining to the propositional case are provided. I also briefly describe some propositional languages which have been added recently to the propositional map. Finally, I sketch some hot-off-the-press results about the compilation of valued CSPs (and similar representations), which can be used in AI for modeling various types of information, including probability distributions and utility functions.
Of course, due to space reasons, some choices had to be made, so that this paper is far from an exhaustive survey of the topic. Among other things, KC for closed-world reasoning and default reasoning, KC for modal logics/description logics, KC for lifted probabilistic inference, and applications of KC to diagnosis and to planning are all interesting topics which are not covered here.

## KC for Clausal Entailment

One of the main applications of KC (at least from a historical perspective) is to enhance the efficiency of clausal entailment CE in propositional logic. One often makes a distinction between the approaches satisfying CE (called exact compilation methods) and the approaches ensuring that a (proper) subset of all clausal queries can be answered exactly in polynomial time (approximate compilation methods). In the first case, C is said to be equivalence-preserving. In the second case, the subset of queries under consideration is not explicitly given as part of the entailment problem; it can be intensionally characterized, for instance as the set of all clauses generated from a given set of variables or literals, all the Horn clauses, all the clauses from k-CNF (for a fixed parameter k), etc. It can also be unspecified at the start and derived as a consequence of the compilation technique used. For instance, one can associate with any propositional formula F one of the logically weakest HORN-CNF formulae fl(F) implying F, as well as the logically strongest HORN-CNF formula fu(F) implied by F (Selman and Kautz 1996). While fu(F) is unique up to logical equivalence, exponentially many non-equivalent fl(F) may exist. Once fl(F) and fu(F) are computed, they can be used to answer in polynomial time every clausal query V s.t. fl(F) ⊭ V (1) or fu(F) |= V (2) (due to the transitivity of |=): the answer is "no" in case (1) and "yes" in case (2), and Horn (clausal) queries can always be answered exactly using fu(F).
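The decision scheme based on the two bounds can be sketched as follows. This is a toy illustration: the bounds fl and fu are supplied by hand rather than computed as in Selman and Kautz's algorithm, the entailment test is brute force instead of the polynomial-time Horn procedure, and all helper names are invented:

```python
from itertools import product

def entails(cnf, clause, variables):
    """Brute-force test of cnf |= clause: cnf together with the negated
    clause (k extra unit clauses) must have no model."""
    full = list(cnf) + [(-l,) for l in clause]
    for bits in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, bits))
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in full):
            return False
    return True

def answer_with_bounds(fl, fu, clause, variables):
    """Since fl |= F and F |= fu:  fu |= V  forces a sure 'yes', and
    fl not |= V  forces a sure 'no'; otherwise one must fall back on F."""
    if entails(fu, clause, variables):
        return "yes"
    if not entails(fl, clause, variables):
        return "no"
    return "unknown"

# For F = (x1 v x2): fl = [(1,)] is one Horn lower bound, and fu = []
# (i.e. 'true') is its Horn upper bound, F having no non-tautological
# Horn implicate.
fl, fu, variables = [(1,)], [], [1, 2]
print(answer_with_bounds(fl, fu, (2,), variables))        # 'no'
print(answer_with_bounds(fl, fu, (1, 2), variables))      # 'unknown'
print(answer_with_bounds([(2,)], [(2,)], (2,), variables))  # 'yes' (F = x2 is Horn)
```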
Such approaches have been extended to other languages (see e.g., (Simon and del Val 2001)). Both families of methods (i.e., the exact methods and the approximate ones) include approaches based on prime implicates (or the dual concept of prime implicants), which have been used for a long time for the KC purpose. Like the conjunctive formulae F = F1 ∧ ... ∧ Fn which are decomposable (i.e., the conjuncts do not share common variables), the prime implicates formulae (alias Blake formulae) satisfy a (first) separability property, the conjunctive one: F = F1 ∧ ... ∧ Fn is conjunctively separable if and only if for any clause V, F |= V if and only if there exists i ∈ {1, ..., n} s.t. Fi |= V. A similar (yet distinct) second separability property, the disjunctive one, is satisfied by all disjunctive formulae F: F = F1 ∨ ... ∨ Fn is disjunctively separable, i.e., for any formula V, F |= V if and only if for every i ∈ {1, ..., n}, Fi |= V. Beyond the language PI of prime implicates formulae, those two separability properties underlie many target languages for KC satisfying CE, especially the language DNNF (Darwiche 2001) of propositional circuits in Decomposable Negation Normal Form (and its subsets d-DNNF, OBDD<, DNF, MODS). This can be explained by considering Shannon decompositions of formulae. Indeed, the Shannon decomposition of a formula F over a variable x is a formula (¬x ∧ (F | ¬x)) ∨ (x ∧ (F | x)) equivalent to F and exhibiting several separable subformulae: ¬x ∧ (F | ¬x) and x ∧ (F | x) are disjunctively separable; x and F | x are conjunctively separable; ¬x and F | ¬x are conjunctively separable. Performing in a systematic way the Shannon decomposition of a formula F over the variables occurring in it leads to formulae from languages satisfying CE. Using or not a fixed decomposition ordering gives rise to distinct languages.
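A minimal sketch of why separability pays off computationally: on a decomposable circuit, consistency CO, conditioning CD, and hence clausal entailment CE (conditioning by the negated clause, then a consistency test) all take time linear in the circuit size. The hand-rolled tuple encoding and helper names below are invented for this example, not part of any DNNF package:

```python
# Nodes: ('lit', l), ('and', children), ('or', children), TRUE, FALSE.
TRUE, FALSE = ('true',), ('false',)

def condition(node, lits):
    """CD: fix the literals in 'lits' (a set of signed integers) to true."""
    tag = node[0]
    if tag in ('true', 'false'):
        return node
    if tag == 'lit':
        if node[1] in lits:
            return TRUE
        if -node[1] in lits:
            return FALSE
        return node
    return (tag, tuple(condition(c, lits) for c in node[1]))

def consistent(node):
    """CO: an 'or' node is consistent iff some child is; an 'and' node is
    consistent iff every child is -- which is only sound because
    decomposability makes the conjuncts variable-disjoint."""
    tag = node[0]
    if tag == 'true' or tag == 'lit':
        return True
    if tag == 'false':
        return False
    results = [consistent(c) for c in node[1]]
    return all(results) if tag == 'and' else any(results)

def entails_clause(dnnf, clause):
    """CE via CD + CO: F |= V iff F conditioned on ~V is inconsistent."""
    return not consistent(condition(dnnf, {-l for l in clause}))

# F = (x1 & x2) v (~x1 & x3): each 'and' node is decomposable.
F = ('or', (('and', (('lit', 1), ('lit', 2))),
            ('and', (('lit', -1), ('lit', 3)))))
print(entails_clause(F, (2, 3)))   # True
print(entails_clause(F, (2,)))     # False
```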
Since the DPLL procedure (Davis, Logemann, and Loveland 1962) for the satisfiability problem SAT can be used to perform such decompositions, several target languages for KC can be characterized by the traces achieved by this search procedure (Huang and Darwiche 2007). Many KC approaches to clausal entailment actually exploit the separability properties (see e.g., (Marquis 1995; Schrag 1996; Boufkhad et al. 1997)). Finally, other compilation techniques aim at exploiting the power of unit resolution so as to make clausal entailment tractable. Saturating a CNF formula F using unit resolution, i.e., computing the CNF formula obtained by removing the literal ¬l from every clause of F whenever the unit clause l belongs to F, can be achieved in time linear in the size of F. When the empty clause is generated, a unit refutation from F exists. In order to make a CNF formula F unit-refutation complete (i.e., such that for every implicate V = l1 ∨ ... ∨ lk of it, there is a unit refutation from F ∧ ¬l1 ∧ ... ∧ ¬lk), an approach consists in adding some clauses (typically some prime implicates of F) to it. Computing all the prime implicates is useless; in particular, it is enough to focus on those which can be derived using merge resolution (del Val 1994). There has recently been a revival of the KC techniques for clausal entailment based on unit resolution, see e.g., (Bordeaux and Marques-Silva 2012; Bordeaux et al. 2012). This can be explained by the design, over the past few years, of propagation-complete CNF encodings for several specific constraints in the Constraint Programming community. Basically, one wants to derive CNF encodings which are such that unit propagation is enough to derive every unit clause which is implied by the formula in any conjunctive context. In more formal terms, a CNF formula F is propagation-complete iff for every term l1 ∧ ... ∧ lk and every literal l, if F ∧ l1 ∧ ... ∧ lk |= l, then there exists a unit resolution proof of l from F ∧ l1 ∧ ... ∧ lk.
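Unit-resolution saturation can be sketched in a few lines (clauses are tuples of signed integers; the helper name is invented). The second example shows a CNF that entails a literal unit propagation cannot derive, i.e., a CNF that is not propagation-complete:

```python
def unit_propagate(cnf):
    """Saturate a CNF (iterable of literal tuples) under unit resolution.
    Returns (units, conflict): the set of unit-derived literals, and whether
    the empty clause was produced (i.e. a unit refutation exists)."""
    clauses = [frozenset(c) for c in cnf]
    units = set()
    changed = True
    while changed:
        changed = False
        for c in clauses:
            if any(l in units for l in c):
                continue                       # clause already satisfied
            reduced = [l for l in c if -l not in units]
            if not reduced:
                return units, True             # empty clause derived
            if len(reduced) == 1 and reduced[0] not in units:
                units.add(reduced[0])
                changed = True
    return units, False

# Unit propagation derives 2 and then 3 here:
print(unit_propagate([(1,), (-1, 2), (-2, 3)]))   # ({1, 2, 3}, False)

# But it is weaker than entailment: (x1 v x2) & (x1 v ~x2) entails x1,
# yet contains no unit clause, so propagation derives nothing -- this
# CNF is not propagation-complete.
print(unit_propagate([(1, 2), (1, -2)]))          # (set(), False)
```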
Clearly enough, if F is propagation-complete, then it is unit-refutation complete as well. In order to make F propagation-complete, it is enough to add empowering implicates to it: an implicate V = l1 ∨ ... ∨ lk ∨ l of F is empowering when F ∧ ¬l1 ∧ ... ∧ ¬lk |= l but there is no unit resolution proof of l from F ∧ ¬l1 ∧ ... ∧ ¬lk. It has been proved that F has no empowering implicate (i.e., it is closed under empowerment) precisely when F is propagation-complete. Interestingly, the concept of empowering clause plays an important role for characterizing clause-learning SAT solvers (Pipatsrisawat and Darwiche 2011). Approaches to the compilation of CNF formulae by adding empowering clauses have been developed. Noticeably, there are families of CNF formulae which are closed under empowerment (hence propagation-complete) but also have exponentially many prime implicates (even when considering only those produced by merge resolution).

## A Glimpse at Compilability

For all the compilation functions C considered in the previous section, there exist families of formulae Fs of size s such that C(Fs) is of size exponential in s.¹ While the clausal entailment problem can be solved in polynomial time from C(Fs), the computational benefits of KC are dubious in such a case, since it is not guaranteed that the computational effort spent for compiling Fs can be offset; indeed, due to the size of C(Fs), it can be the case that the time needed to determine whether a clause V is a logical consequence of C(Fs) exceeds the time needed to determine whether V is a logical consequence of Fs. Could a clever choice of C overcome the problem? This question amounts to determining whether the decision problem corresponding to CE is compilable to P. As we will see, the answer to it is negative (under standard assumptions of complexity theory).
Roughly speaking, a decision problem is said to be compilable to a given complexity class C if it is in C once the fixed part of any instance has been pre-processed, i.e., turned off-line into a data structure of size polynomial in the input size. As explained in the introduction, the fact that the pre-processing leads to a compiled form of polynomial size is crucial. In order to formalize such a notion of compilability, Cadoli and his colleagues introduced new classes (compilability classes) organized into hierarchies (which echo the polynomial hierarchy PH), together with the corresponding reductions (see the excellent pieces of work (Liberatore 2001; Cadoli et al. 2002)). This makes it possible to classify many AI problems as compilable to a class C, or as not compilable to C (usually under standard assumptions of complexity theory, namely that PH does not collapse). Several families of classes can be considered as candidates to represent what "compilable to C" means. One of them is the family of comp-C classes:

Definition 1. Let C be a complexity class closed under polynomial many-one reductions and admitting complete problems for such reductions. A language of pairs L1 belongs to comp-C if and only if there exists a polynomial-size² function C and a language of pairs L2 ∈ C such that (F, V) ∈ L1 if and only if (C(F), V) ∈ L2.

Obviously enough, for every admissible complexity class C, we have the inclusion C ⊆ comp-C (choose C as the identity function). Note that no requirement is imposed on the time complexity of C (the function C can even be non-recursive!). This leads to strong non-compilability results. In order to prove the membership of a problem to comp-C, it is enough to follow the definition (hence, to exhibit a polynomial-size compilation function C).
¹ This has been known for years for prime implicates; contrastingly, though a non-constructive proof was already known, the identification of a family of CNF formulae for which the number of clauses to be added for rendering it propagation-complete is exponential is recent (Babka et al. 2013).

² A function C is polynomial-size if and only if there exists a polynomial p such that |C(F)| is bounded by p(|F|) for every F in the domain of C.

Proving that a given problem does not belong to comp-C (under the standard assumptions of complexity theory) is more complex. A notion of comp-reduction suited to the compilability classes comp-C has been pointed out, and the existence of complete problems for such classes proved. However, many non-compilability results from the literature cannot be rephrased as comp-C-completeness results. For instance, it is unlikely that CE is comp-coNP-complete (this would imply P = NP). In order to go further, one needs to consider the compilability classes nu-comp-C (Cadoli et al. 2002):

Definition 2. Let C be a complexity class closed under polynomial many-one reductions and admitting complete problems for such reductions. A language of pairs L1 belongs to nu-comp-C if and only if there exists a binary polynomial-size function C and a language of pairs L2 ∈ C such that for every pair (F, V), we have: (F, V) ∈ L1 if and only if (C(F, |V|), V) ∈ L2.

Here "nu" stands for "non-uniform", which indicates that the compiled form of F may also depend on the size of the varying part V. A notion of non-uniform comp-reduction suited to the compilability classes nu-comp-C has also been pointed out (it subsumes the notion of (uniform) comp-reduction), as well as the existence of complete problems for such classes. For instance, the clausal entailment problem is nu-comp-coNP-complete. Inclusions between the compilability classes comp-C and nu-comp-C similar to those holding in the polynomial hierarchy PH exist (see (Cadoli et al. 2002)).
It is strongly believed that the corresponding compilability hierarchies are proper: if one of them collapses, then the polynomial hierarchy collapses as well (cf. Theorem 2.12 from (Cadoli et al. 2002)). For instance, if the clausal entailment problem CE is in nu-comp-P, then the polynomial hierarchy collapses. Accordingly, in order to show that a problem is not in comp-C, it is enough to prove that it is nu-comp-C′-hard, where C′ is located higher than C in the polynomial hierarchy. Since complete problems for any nu-comp-C class can be easily derived from complete problems for the corresponding class C of the polynomial hierarchy, nu-comp-C-complete problems appear as a key tool for proving non-compilability results. Thus, the compilability of a number of AI problems, including diagnosis, planning, abduction, belief revision, belief update, closed-world reasoning, and paraconsistent inference from belief bases, has been investigated since then.

## The Notion of KC Map

Another important issue to be addressed in KC is the choice of a target language L, i.e., the representation language of compiled forms (L is the co-domain of C). Obviously, this is a domain-dependent issue, since it depends on the tasks we would like to improve via KC, computationally speaking. However, as explained in the introductory section, many tasks can be decomposed into domain-independent basic queries and transformations fi, so that one can focus on such queries and transformations instead of the tasks themselves. Stepping back to the configuration problem, beyond determining whether the user choices correspond to a feasible product, one can be asked to count the feasible products compatible with those choices, and when their number is small enough, the user can be interested in enumerating such products.
Generally speaking, the choice of a target language L for KC is based on several criteria, including:

- the expressiveness and the spatial efficiency (succinctness) of L;
- the time efficiency of L for the queries of interest;
- the time efficiency of L for the transformations of interest.

Expressiveness is modeled by a pre-order over representation languages. It captures the ability of encoding information. Considering a language L which is not expressive enough for representing F may lead to a loss of information (the best we can do is then to approximate F in L; see the previous discussion about approximate compilation methods for the clausal entailment problem). Succinctness is a refinement of expressiveness which takes the representation sizes into account.

### The Propositional Case

In the propositional case, expressiveness and succinctness are defined as follows:

Definition 3. Let L1 and L2 be two propositional languages. L1 is at least as expressive as L2, denoted L1 ≤e L2, iff for every formula α ∈ L2, there exists an equivalent formula β ∈ L1; L1 is at least as succinct as L2, denoted L1 ≤s L2, iff there exists a polynomial p such that for every formula α ∈ L2, there exists an equivalent formula β ∈ L1 where |β| ≤ p(|α|).

Observe that the definition of succinctness does not require the existence of a function C computing β given α in polynomial time; it is only asked that a polynomial-size function C exists. The succinctness criterion is important because the time complexity of any fi depends on the size of the input C(F); switching from F to an exponentially larger C(F) can be a way to obtain polynomial-time algorithms for some fi, but actual benefits will hardly be reached (as an example, just consider the compilation of CNF formulae into the MODS language). The KC map presented in (Darwiche and Marquis 2002) is suited to the case of classical propositional logic.
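The MODS remark above can be made concrete: compiling to MODS (listing every model explicitly) turns clausal entailment into a trivial scan, but the compiled form may be exponentially larger than the CNF. A toy sketch with invented helper names:

```python
from itertools import product

def compile_to_mods(cnf, variables):
    """MODS compilation: make every model of F explicit.  Queries become
    linear scans of the compiled form -- but the compiled form itself can
    be exponentially larger than the CNF."""
    mods = []
    for bits in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, bits))
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in cnf):
            mods.append(a)
    return mods

def ce_on_mods(mods, clause):
    """F |= V iff every explicit model of F satisfies the clause V."""
    return all(any(m[abs(l)] == (l > 0) for l in clause) for m in mods)

# A single 2-literal clause over 10 variables already has 768 models:
mods = compile_to_mods([(1, 2)], list(range(1, 11)))
print(len(mods))                  # 768
print(ce_on_mods(mods, (1, 2)))   # True
print(ce_on_mods(mods, (1,)))     # False
```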
Beyond clausal entailment CE, the queries considered in this map are tests for consistency CO (i.e., the SAT problem), validity VA, implicants IM, equivalence EQ, and sentential entailment SE. Model counting CT and model enumeration ME have also been considered. A number of transformations have also been taken into account in (Darwiche and Marquis 2002), especially conditioning CD, the closures w.r.t. the n-ary connectives ∧C (conjunction) and ∨C (disjunction) and the unary one ¬C (negation), and the corresponding bounded versions ∧BC and ∨BC. Forgetting FO (i.e., the elimination of existentially quantified variables) is another transformation which is important for many problems. Each query and each transformation corresponds to a specific function fi of interest. Queries and transformations are also viewed as properties satisfied (or not) by representation languages: roughly speaking, L is said to satisfy a given query / transformation fi when there exists a polynomial-time algorithm for computing fi provided that the input is in L; L is said not to satisfy fi if the existence of such an algorithm is impossible or if it would imply P = NP. The KC map reported in (Darwiche and Marquis 2002) gathers the multi-criteria evaluation of a number of propositional languages. As to time efficiency, one can find positive results in it: for instance, DNNF satisfies CO, CE, ME, CD, FO, ∨C. Negative results can also be found in the KC map, e.g., the fact that DNNF does not satisfy any of VA, IM, EQ, SE, CT, ∧BC, ¬C unless P = NP. Importantly, the KC map also indicates how languages compare with one another w.r.t. expressiveness/succinctness. For instance, while DNNF, d-DNNF, OBDD< and MODS are equally (and fully) expressive (i.e., every propositional formula has a representation in any of those languages), they are not equally succinct; indeed, we have the following strict succinctness ordering: DNNF