# counting_maximal_satisfiable_subsets__f65415f3.pdf Counting Maximal Satisfiable Subsets Jaroslav Bend ık,1 Kuldeep S. Meel 2 1 Masaryk University, Brno, Czech Republic 2 National University of Singapore, Singapore xbendik@fi.muni.cz, meel@comp.nus.edu.sg Given an unsatisfiable set of constraints F, a maximal satisfiable subset (MSS) is a maximal subset of constraints C F such that C is satisfiable. Over the past two decades, the steady improvement in runtime performance of algorithms for finding MSSes has led to increased adoption of MSSbased techniques in a wide variety of domains. Motivated by the progress in finding an MSS, the past decade has witnessed a surge of interest in the design of algorithmic techniques to enumerate all the MSSes, which has subsequently led to a discovery of new applications utilizing enumeration of MSSes. The development of techniques for finding and enumeration of MSSes mirrors a similar phenomenon of finding and enumeration of SAT solutions in the early 2000s, which subsequently motivated the design of algorithmic techniques for model counting. In a similar spirit, we undertake a study to investigate the feasibility of MSS counting techniques. In particular, the focus point of our investigation is to answer whether one can design efficient MSS counting techniques that do not rely on explicit MSS enumeration. The primary contribution of this work is an affirmative answer to the above question in the form of a novel algorithm. The algorithm uses a novel architecture of a wrapper W and a remainder R such that the desired MSS count can be expressed as |W| |R|. To efficiently compute |W| and |R|, the algorithm relies on the advances in projected model counting. Our empirical evaluation demonstrates that our approach can scale to instances clearly beyond the reach of enumeration-based techniques. Introduction Logical constraints have emerged as a prominent representation language to model environments and agents. A set of constraints is called satisfiable if there exists an assignment to variables such that all the constraints are satisfied. Similarly, a set of constraints is unsatisfiable (also referred to as over-constrained) if there does not exist an assignment that would satisfy the constraints (Meseguer et al. 2003). If we are given an unsatisfiable set of constraints, the goal is often to analyze the unsatisfiability. In such cases, two notions are of particular interest: a Minimal Unsatisfiable Subset (MUS), which is a minimal subset of constraints that is unsatisfiable, and a Maximal Satisfiable Subset (MSS), which is a maximal subset of constraints that is satisfiable (Liffiton Copyright 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. and Sakallah 2008; Belov, Lynce, and Marques-Silva 2012; Reiter 1987). A dual notion to an MSS is that of a Minimal Correction Subset (MCS), which is a minimal subset of constraints that need to be relaxed. Formally, given a set of constraints F, a set C F is a MSS of F iff F \ C is a MCS of F. The design of modern AI systems often encounters overconstrained systems and in such scenarios, MCSes and MSSes often play a major role (Bailey and Stuckey 2005; Liffiton and Sakallah 2008). In particular, in the field of diagnosis of systems, an MCS represents the constraints that need to be relaxed for the system to be conflict-free. Similarly, in the context of a belief update and argumentation, an MSS plays a key role in the update of belief in the presence of an incoming contradictory belief. We refer the reader to (Besnard, Gr egoire, and Lagniez 2015) for further discussion. Often, finding a single MSS (resp. MCS) is not sufficient, and we seek to enumerate all MSSes or to count the number of MSSes. Especially, the MSS count serves as a good diagnostic metric (Thimm 2018; Hunter and Konieczny 2008), or as an indicator of the feasibility of the complete MSS enumeration. Our interest in counting the number of MSSes is motivated by the rise of Beyond NP paradigm wherein the progress in the design of efficient techniques for satisfiability paved the way for interest in the design of efficient techniques for problems such as counting, sampling, optimization, and the like. In particular, the past two decades have witnessed a proliferation of efficient techniques for model counting, also denoted as #SAT. It is worth remarking that initial studies into model counting were motivated by applications in Bayesian inference but the subsequent availability of efficient model counting techniques have now led to several new applications ranging from neural network verification (Baluta et al. 2019), quantified information flow (Biondi et al. 2018), computational biology (Sashittal and El-Kebir 2020), network reliability, and the like. In this regard, we view that given the availability of efficient techniques for finding an MSS/MCS, it is the right time to pursue an investigation into the design of efficient counting techniques for MSSes/MCSes, and the availability of efficient counting techniques for MCSes/MSSes would lead to a discovery of a diverse set of applications for MSS counting. Similarly to the early years of research into #SAT, the The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21) best-known technique, as of now, to perform the MSS counting is to employ state of the art techniques for complete MSS enumeration. While MSS enumeration techniques have improved over the years, the complete MSS enumeration is often practically intractable since there can be up to exponentially many MSSes w.r.t. the size of the input constraint set. In this context, the primary research question that we seek to investigate is whether we can design MSS counting techniques that do not necessarily rely on enumeration?. We envision development of MSS counting techniques to take advantage of the progress in the model counting techniques. Given that the problem of finding an MSS is in FPNP[logn] (i.e., harder than the classical SAT problem), a natural target problem is projected model counting, a generalization of the classical model counting problem; the projected model counting is known to be #NP-hard in contrast to #P-completeness of classical model counting. The primary contribution of this paper is an affirmative answer to the above question. In particular, we design a new algorithmic framework that uses a novel architecture of a wrapper W and a remainder R such that the desired MSS count corresponding to the formula F is |W| |R|. We encode the wrapper W and the remainder R via Boolean formulas W and R with suitable projection sets such that the projected model count of W and R is equal to |W| and |R| respectively. We present four different strategies for the construction of wrappers (and their corresponding remainders ) and observe the soundness of a composition of different wrappers. The reduction to projected model counting allows us to build on recent advances in the design of efficient component caching-based projected model counting techniques. To demonstrate the empirical effectiveness of our approach, we implemented a Python-based prototype and performed a detailed empirical analysis. Out of 1200 benchmarks, the enumeration-based techniques can solve 353 benchmarks while our approach can solve 510 benchmarks. Prelimilaries and Problem Formulation We use standard definitions for propositional logic. A propositional formula F is built over a set of literals, where a literal is either a Boolean variable or its negation. Vars(F) denotes the set of variables used in F. A valuation π of a finite set A of variables is a mapping from A to {1, 0}. Vals(F) denotes the set of all valuations of Vars(F). Given a valuation π and a formula F, we write F[π] to denote the substitution of each variable x in the domain of π by the value π(x); furthermore, we apply trivial simplifications, e.g., G 0 = G, G 0 = 0, etc. Observe that if A Vars(F), then F[π] is simplified either to 1 or to 0. If A Vars(F) and F[π] = 1, we write π |= F and we call π a model of F; otherwise, if F[π] = 0, we write π |= F. A formula is satisfiable iff it has a model, and, otherwise, it is unsatisfiable. We write MF to denote the set of all models of a formula F. Furthermore, for a set A of variables such that A Vars(F), we write MF A to denote the projection of MF on A, and for π MF , we write π A to denote the projection of π on A. Two formulas F and G are equivalent, denoted F G, iff MF = MG. 1000 0100 0010 0001 1100 1010 0110 1001 0101 0011 1110 1101 1011 0111 Figure 1: Illustration of P(F) from the Example 1. We denote individual subsets of F as bit-vectors, e.g., {f1, f3} is written as 1010. The subsets with a dashed border are the unsatisfiable subsets, and the others are satisfiable subsets. The MSSes are filled with a background color. A CNF formula is a conjunction of disjunctions (clauses) of literals. A CNF formula can be also viewed as a multiset of clauses where a clause is a set of literals; the two representations are used interchangeably and are clear from the context. Throughout the whole text, we use F to denote the input CNF formula of interest. We use capital letters, e.g., S, K, N, or blackboard bold letters, e.g., W, R, to denote other formulas, small letters, e.g., f, f1, fi, to denote clauses, and small letters, e.g., x, x , y, to denote variables. Finally, given a set X, we write P(X) to denote the powerset of X, and |X| to denote the cardinality of X. Definition 1 (MSS). A set N, N F, is a maximal satisfiable subset (MSS) of F iff N is satisfiable and every N , such that N N F, is unsatisfiable. Definition 2 (MCS). A set N, N F, is a minimal correction subset (MCS) of F iff F \ N is satisfiable and for all N , such that N N, the set F \ N is unsatisfiable. Equivalently, N is a MCS of F iff F \ N is a MSS of F. Note that the maximality is the set maximality and not the maximum cardinality as e.g. in the Max SAT problem. Consequently, there can be MSSes with different cardinalities, and in total, there can be up to exponentially many MSSes of F w.r.t. |F| (see the Sperner s theorem (Sperner 1928)). The same applies also to MCSes. We write MSSF to denote the set of all MSSes of F, MCSF to denote the set of all MCSes of F, SSF to denote the set of all satisfiable subsets of F, and non MSSF to denote the set SSF \ MSSF of all satisfiable subsets of F that are not MSSes. Example 1. We demonstrate the concepts of MSSes and MCSes on an example, illustrated in Fig. 1. Assume that F = {f1 = {x1}, f2 = { x1}, f3 = {x2}, f4 = { x1, x2}}. There are 3 MSSes: MSSF = {{f2, f3, f4}, {f1, f4}, {f1, f3}}, and thus also 3 MCSes: MCSF = {{f1}, {f2, f3}, {f2, f4}}. Problem Definitions In this paper, we are concerned with the following three problems. Name: #MSS Input: A formula F. Output: The number |MSSF | of MSSes of F. Name: #MCS Input: A formula F. Output: The number |MCSF | of MCSes of F. Name: proj-#SAT Input: A formula F and a set of variables S Vars(F). Output: The number MF S of models of F projected on S. Our goal is to solve the #MCS and #MSS problems. Since MCSes are complements of MSSes, the two problems are equivalent. Thus, in the rest of the paper, we focus only on the #MSS problem. Finally, we do not focus on solving the proj-#SAT problem; instead, we propose several reductions of #MSS to proj-#SAT. Related Work MSS Counting As far as we know, there is no algorithm dedicated to counting MSSes. A straightforward approach to determine the count is to enumerate all the MSSes via an MSS enumeration algorithm, e.g., (Bailey and Stuckey 2005; Stern et al. 2012; Liffiton et al. 2016; Marques-Silva et al. 2013; Bend ık et al. 2016; Narodytska et al. 2018; Previti et al. 2018; Bend ık and ˇCern a 2020), and then simply count the enumerated MSSes. However, the complete MSS enumeration is often practically intractable, since there can be exponentially many MSSes w.r.t. |F| and thus, the MSSes just cannot be explicitly enumerated in a reasonable time. Another possible solution how to count MSSes (MCSes) is based on a well-known duality between MCSes and socalled minimal unsatisfiable subsets (MUSes) of F. A set N is a MUS of F iff N is unsatisfiable and f N the set N \ {f} is satisfiable. The hitting set duality (Reiter 1987; de Kleer and Williams 1987) between MUSes and MCSes says that every MCS is a minimal hitting set of the set of all MUSes of F. Consequently, one can first use a MUS enumeration tool, e.g., (Bailey and Stuckey 2005; Stern et al. 2012; Liffiton et al. 2016; Bacchus and Katsirelos 2015, 2016; Narodytska et al. 2018; Bend ık, ˇCern a, and Beneˇs 2018; Liu and Luo 2018; Bend ık and ˇCern a 2020a,b), to identify all MUSes of F, and then count the number of minimal hitting sets of the MUSes. The problem is that there can be also exponentially many MUSes w.r.t. |F|, which makes the MUS enumeration also often practically intractable. It is also worth remarking a recent hashing-based approach, called AMUSIC, to approximate counting of MUSes of a given formula (Bend ık and Meel 2020). AMUSIC identifies an approximate MUS count via polynomially many calls of a ΣP 3 oracle. In contrast, we focus on the exact counting and while the notions of MUSes and MSSes share a close relationship, we are unaware of any efficient reduction of the MSS counting problem to the MUS counting problem. Model Counting In his seminal paper (Valiant 1979), Valiant showed that the problem of propositional model counting (i.e., proj-#SAT when S = V ars(F)) is #Pcomplete. Durand, Hermann, and Kolaitis (2005) showed that the general problem of proj-#SAT is #NP-hard. From a practical perspective, the earliest work on model counting dates to Birnbaum and Lozinskii (1999), which sought to rely on smarter enumeration strategies of partial solutions. Subsequently, Bayardo and Pehoushek (Bayardo Jr and Pehoushek 2000) introduced the notion of component caching, wherein a residual formula after substituting the current partial assignment can be partitioned into different subsets of clauses such that these subsets do not share variables. Each of these subsets is called a component, and the model count of the formula is obtained by multiplying the corresponding counts for each of the components. Therefore, the model count is often determined by explicitly identifying only a fraction of all models. Caching scheme is used to avoid recomputation as similar components appear in different parts of the search space. Over the past two decades, there has been a series of algorithmic and system-driven improvements of component caching-based model counting techniques (Sang et al. 2004; Sang, Beame, and Kautz 2005; Thurley 2006; Muise et al. 2012; Sharma et al. 2019). There has been also an extensive research on compilationbased model counting including e.g. C2D (Darwiche 2004), SDD (Darwiche 2011), and D4 (Lagniez and Marquis 2017). Over the past 5 years, there has been a concentrated effort on developing efficient projected model counting techniques (Chakraborty et al. 2014; Aziz et al. 2015; Chakraborty, Meel, and Vardi 2016; M ohle and Biere 2018; Sharma et al. 2019; Lagniez and Marquis 2019). These techniques rely on appropriate modifications of standard propositional model counters (as described above). In this work, we rely on the state of the art projected model counter GANAK (Sharma et al. 2019), which was part of the system that won the Projected model counting track at the recently organized model counting competition. Counting the Number of MSSes Basic Idea Our approach for finding the MSS count |MSSF | is based on a simple observation: one can count the number |SSF | of all satisfiable subsets of F, the number |non MSSF | of satisfiable subsets that are not MSSes, and then do the math: |MSSF | = |SSF | |non MSSF |. In fact, even a more general observation holds: Definition 3 (wrapper and remainder). A set W of subsets of F is a wrapper iff MSSF W SSF . Futhermore, the remainder of W is the set R = W non MSSF . Proposition 1. Let W be a wrapper and R its remainder. Then |MSSF | = |W| |R|. Proof. SSF = MSSF non MSSF , and MSSF non MSSF = , hence MSSF = W \ non MSSF = W \ R, and since R W, it holds |MSSF | = |W| |R|. Our approach for counting MSSes consists of the following steps. First, we find a suitable wrapper W and the corresponding remainder R. Then, we encode the wrapper W and the corresponding remainder R with two formulas, W and R, such that each projected model of W and R corresponds to an element of W and R, respectively. Finally, we use a projected model counting tool to count the models of W and R, and hence to determine |W| and |R| which yields also the MSS count |MSSF | (Proposition 1). In the following section, we provide details on what is and how to find a suitable wrapper, how to build the formulas W and R, and what is the projection set. Wrappers and Remainders In this section, we gradually present 4 different wrappers W1, . . . , W4 and their corresponding remainders R1, . . . , R4. For each wrapper Wi and its remainder Ri, we also build the corresponding formulas Wi and Ri. Subsequently, we show how to combine multiple wrappers into a single, possibly more efficient, wrapper. Wrapper W1 Our first wrapper W1 is simply the set SSF of all satisfiable subsets of F, and the corresponding remainder R1 is thus SSF non MSSF = non MSSF . We build the formula W1 using the variables Vars(F) of F and an additional set of activation variables A = {af | f F}: f F (f af) (1) Intuitively, by setting a variable af to 1, we activate the sub-clause f in the clause f = (f af) of W1 since satisfying f is then the only way to satisfy f . Let us denote by AF(π) the one-to-one mapping (bijection) between a valuation π of A and the corresponding set of activated clauses of F, i.e., AF(π) = {f F | π(af) = 1}. Proposition 2. For every valuation π of A, π MW1 A iff AF(π) W1. Consequently, |MW1 A| = |W1|. Proof. : Let π be a model of W1 such that π A = π. We show that π |= AF(π), hence AF(π) is satisfiable and belongs to W1. For every f AF(π), we know that π |= (f af). Moreover, by the definition of AF(π), π(af) = 1 = π (af), i.e., π |= af, and thus π |= f. : Let N be an element of W1 and φ its model. We define a valuation π of W1 as π (x) = φ(x) if x Vars(F), π (af) = 0 if f N, and π (af) = 1 if f N. Observe that AF(π A) = N. Furthermore, π |= W1: every clause (f af) W1 such that f N is inherently satisfied by φ, and every clause (f af) W1 such that f N is satisfied by π (af) = 0. Hence, π = π A MW1 A To build the formula R1 that encodes the remainder R1 = SSF non MSSF , i.e., the set of all satisfiable subsets of F that are not MSSes, we introduce another set B of activation variables B = {bf | f F}. Similarly as in the case of A, given a valuation π of B, let us by BF(π) denote the subset {f F | π(bf) = 1} of F. By the definition of a MSS, a satisfiable subset N of F is not a MSS iff there exists a satisfiable N such that N N F. We use AF(π) and BF(π) to encode such N and N , respectively, in R1: f F (af = bf) f F ( af bf) (2) Intuitively, the first conjunct W1 encodes that AF(π) is satisfiable, the second conjunct encodes that BF(π) is satisfiable, and the last two conjuncts express that AF(π) BF(π). Note that since both the first conjuncts reason about satisfiability of subsets of F, we use in the second conjunct a primed version F of F, i.e. a copy of F where each literal is substituted by its primed version. Proposition 3. For every valuation π of A, π MR1 A iff AF(π) R1. Consequently, |MR1 A| = |R1|. Proof. : Let π be a model of R1 such that π A = π. Since R1 subsumes W1, π |= W1, and hence by Proposition 2 AF(π) is satisfiable. The set BF(π B) is defined and constrained in R1 analogously to AF(π), thus BF(π B) is also satisfiable. Furthermore, AF(π) BF(π B) since π |= V f F (af = bf) W f F ( af bf). : Let N be an element of R1, N a satisfiable superset of N, φ a model of N, and φ a model of N . We build a model π of R1 as π (x) = φ(x) if x Vars(F), π (af) = 0 if f N, π (af) = 1 if f N, π (x) = φ (x ) if x Vars(F ), π (bf) = 0 if f N , and π (bf) = 1 if f N . Analogously to the proof of Proposition 2, we know that π |= W1 V f F (f bf). As for the remaining part of R1, since N N , we have that π (af) = 1 implies π (bf) = 1 for every f F, and that there is at least one f F such that π (af) = 0 and π (bf) = 1. Based on the above observations, we can use a projected model counter to determine the cardinalities |W1| and |R1|, and then employ Proposition 1 to deduce the MSS count. However, due to the complexity of projected model counting, determining the model counts for W1 and R1 can be practically intractable. In general, there are two main criteria that affect the practical efficiency of the projected model counting. One criterion is the cardinality of the projection set, which is in the case of W1 |A| = |F|. The other criterion is the number of the models, i.e., |W1|, which can be exponential w.r.t. |F| since there are 2|F | subsets of F and all of them (excluding the whole F) can be satisfiable. In the following, we propose three other wrappers (and corresponding remainders) that tend to optimize these two criteria by providing a better description of MSSes. All formulas that encode the following wrappers and their remainders use the same variables as in the case of W1 and R1, i.e., Vars(F) Vars(F ) A B. We also use the notation AF(π) and BF(π) to map valuations of A and B, respectively, to subsets of F. Wrapper W2 Our second wrapper, W2, exploits the intersection IMSSF of all MSSes of F. Clearly, for every MSS N MSSF it holds that IMSSF N. Thus, we could define the next wrapper as the set of all satisfiable subsets of F that are supersets of IMSSF . Unfortunately, computing the intersection can be very expensive (see below), thus, we exploit a more general observation: Observation 1. For every MSS N MSSF and every underapproximation I of IMSSF , i.e., I IMSSF , it holds that I N. Given an under-approximation I of IMSSF , we define the second wrapper as W2 = {N SSF | I N}. The formulas W2 and R2 that encode W2 and R2, respectively, are defined as follows: R2 = W2 R1 (4) Proposition 4. For every valuation π of A, π MW2 A iff AF(π) W2. Consequently, |MW2 A| = |W2|. Proof. : W2 subsumes W1, thus for every π MW2 A the set AF(π) is satisfiable (Proposition 2), and since f AF(π) iff π |= af, V f I af ensures that I AF(π). : Given N W2 and a model φ of N, we build a valuation π of W2 as π (x) = φ(x) if x Vars(F), π (af) = 0 if f N, and π (af) = 1 if f N. Similarly as in the proof of Proposition 2, observe that AF(π A) = N and that π |= W2, thus π = π A MW2 A. Proposition 5. For every valuation π of A, π MR2 A iff AF(π) R2. Consequently, |MR2 A| = |R2|. Proof. MR2 A = M(W2 R1) A = MW2 A MR1 A = {π | AF(π) W2} {π | AF(π) R1} = {π | AF(π) W2 R1} = {π | AF(π) R2}. The other direction holds since AF is a one-to-one mapping (bijection). Note that by enforcing the variables {af | f I} to be set to 1, we effectively reduce the size of the projection set A since all models of W2 (and R2) agree on the assignment to these variables. In other words, |MW2 A| = |MW2 (A\{af | f I})| (and |MR2 A| = |MR2 (A\{af | f I})|). Based on our practical experience, the IMSSF is often relatively large, i.e., I can be also relatively large, and thus we can significantly reduce the projection set. The remaining question is how to compute either exactly IMSSF or at least its under-approximation I. We are not aware of any work that would be dedicated to computing the intersection IMSSF of all MSSes of F. Yet, as shown in (Kullmann 2000a), it holds that IMSSF = F \ UMUSF where UMUSF is the union of all minimal unsatisfiable subsets (MUSes) of F. Unfortunately, based on a recent study (Menc ıa et al. 2019), computing UMUSF , and hence also IMSSF , is often practically intractable even for relatively small formulas. On the other hand, it is often possible to cheaply compute a good over-approximation of UMUSF via the concepts of autark variables and lean kernel. A set A Vars(F) is an autark of F iff there exists a truth assignment to A such that every clause of F that contains a variable from A is satisfied by the assignment (Monien and Speckenmeyer 1985). It is known (Kleine B uning and Kullmann 2009; Kullmann 2000b) that the union of two autark sets is also an autark set, and thus there exist a unique largest autark set of F. The lean kernel K of F is the set of all clauses that do not contain any variable from the largest autark set. It holds (Kleine B uning and Kullmann 2009; Kullmann 2000b) that the lean kernel K of F is an over-approximation of UMUSF . Consequently, F \ K is an under-approximation of IMSSF . We employ an approach from (Marques-Silva et al. 2014) to compute the lean kernel K and then use I = F \ K to build the wrapper W2. Similarly as we used the intersection IMSSF , one might think of exploiting the union UMSSF of all MSSes; clearly, every MSS of F is contained in UMSSF . Thus, at first glance, it makes sense to build a wrapper that contains all satisfiable subsets of F that are subsets of UMSSF . Yet, we observe that UMSSF = F and thus the use of the union would not bring any benefit compared to W1: Proposition 6. For every formula F (with no empty clause) and the union UMSSF of all MSSes of F, it holds that F = UMSSF . Proof. By contradiction, assume a clause f F that is not contained in any MSS of F. Since f is non-empty, it is necessarily satisfiable, and hence either {f} is a MSS of F or there exists an MSS N of F such that N {f}. Wrapper W3 Our next wrapper, W3, is based on the following characterization of MSSes: Proposition 7. For every MSS N of F there exists a valuation φ of Vars(F) such that φ |= N and for every f F \N it holds that φ |= f. Proof. N is satisfiable, thus it has a model. For every model φ of N, if φ |= f for some f F \ N, then φ |= N {f} which contradicts that N is a MSS. The corresponding wrapper for the property stated in Propositon 7 is W3 = {N SSF | φ Vals(F).φ |= N V f F \N f}. The formulas W3 and R3 encoding W3 and R3, respectively, are the following: f F ( af f) (5) R3 = W3 R1 (6) Proposition 8. For every valuation π of A, π MW3 A iff AF(π) W3. Consequently, |MW3 A| = |W3|. Proof. : W3 subsumes W1, thus for every model π of W3 such that π = π A it holds that π |= AF(π) (Proposition 2). Furthermore, by the definition of AF, f AF(π) iff π |= af (i.e., π(af) = 0). Thus, the clauses V f F ( af f) of W3 ensure that for all f F \ AF(π) it holds that π |= f. Hence, π is the model φ from the definition of W3. : Given N W3 and a model φ of N V f F \N f (by the definition of W3), we build a valuation π of W3 as π (x) = φ(x) if x Vars(F), π (af) = 0 if f N, and π (af) = 1 if f N. Similarly as in the proof of Proposition 2, observe that AF(π A) = N and that π |= W3, thus π = π A MW3 A. Proposition 9. For every valuation π of A, π MR3 A iff AF(π) R3. Consequently, |MR3 A| = |R3|. Proof. MR3 A = M(W3 R1) A = MW3 A MR1 A = {π | AF(π) W3} {π | AF(π) R1} = {π | AF(π) W3 R1} = {π | AF(π) R3}. The other direction holds since AF is a one-to-one mapping (bijection). Wrapper W4 Our next wrapper, W4, is based on another property of MSSes. Intuitively, assume an MSS N of F and a model φ of Vars(F). We observe that for every clause f F \ N and every literal l of the clause, there is a clause g N that forces l to be falsified. If there would be l that is not forced to be falsified, then the model φ can be relaxed to a model φ that would satisfy N {f} (which is not possible since N is a MSS). Formally: Proposition 10. Let N be an MSS of F and φ a valuation of Vars(F). If φ |= N, then φ |= V f F \N P, where P = V l f W g {g N | l g} V k g\{ l} k. Proof. By contradiction, let φ be a model of N such that φ |= V f F \N P. Hence, there exists f F \ N and l f such that φ |= W g {g N | l g} V k g\{ l} k. In other words, for every clause g G = {g N | l g} there is a literal k g, k = l, with φ |= k. Now, assume that we turn φ into a valuation φ by only flipping the assignment to l, i.e., φ |= l. Clearly, φ |= f since l f. Furthermore, φ |= N \ G since these clauses do not contain l and thus the change of the assignment to l does not affect them. Finally, φ |= G since every g G contains a literal k, k = l, with φ |= k (and φ agrees with φ on k). Hence, N {f} is satisfiable, which contradicts that N is a MSS. Unfortunately, the proposition reasons about all models of a MSS (i.e., a universal property), which is expensive to encode with a propositional formula. Yet, since every MSS has at least a single model, we can relatively cheaply encode a weaker, existential, variant of Proposition 10. We define W4 as W4 = {N SSF | φ Vals(F).φ |= N V f F \N P}, where P = V l f W g {g N | l g} V k g\{ l} k. We encode W4 and its reminder R4 via W4 and R4 as follows: f F af = P , where g {g F | l g} (ag k g\{ l} k) (7) R4 = W4 R1 (8) Proposition 11. For every valuation π of A, π MW4 A iff AF(π) W4. Consequently, |MW4 A| = |W4|. Proof. : Let π be a model of W4 such that π = π A. We show that π and AF(π) comply with the conditions on φ and N, respectively, from the definition of W4. W4 subsumes W1, thus π |= AF(π) (Proposition 2). Furthermore, since π |= W4 and π = π A, it holds that π |= W4[π]. Finally, as f AF(π) iff π |= af, observe that W4[π] V f F \AF(π) V l f W g {g AF(π) | l g} V k g\{ l} k (which is the condition on φ). : Given N W4 and a valuation φ such that φ |= N V f F \N P (as in the definition of W4) , we build a model π of W4 same as we did in the proof of Proposition 2, i.e., π (x) = φ(x) if x Vars(F), π (af) = 0 if f N, and π (af) = 1 if f N. Like in the proof of Proposition 2, it holds that AF(π A) = N and π |= W1. As for the rest of W4, it generally holds that π |= (V f F af = P ) iff π |= (V f F af = P )[π A]. Furthermore, (V f F af = P )[π A] V f F \N P, since π A |= af iff f N. Finally, φ |= V f F \N P and π agrees with φ on Vars(V f F \N P), hence π |= V f F \N P. Proposition 12. For every valuation π of A, π MR4 A iff AF(π) R4. Consequently, |MR4 A| = |R4|. Proof. MR4 A = M(W4 R1) A = MW4 A MR1 A = {π | AF(π) W4} {π | AF(π) R1} = {π | AF(π) W4 R1} = {π | AF(π) R4}. The other direction holds since AF is a one-to-one mapping (bijection). Combining The Wrappers Proposition 13. For every two wrappers Wi, Wj {W1, . . . , W4} and their remainders Ri, Rj, it holds: 1. Wi Wj is a wrapper, and Ri Rj is its remainder. 2. For every valuation π of A, π M(Wi Wj) A iff AF(π) Wi Wj. Consequently, |M(Wi Wj) A| = |Wi Wj|. 3. For every valuation π of A, π M(Ri Rj) A iff AF(π) Ri Rj. Consequently, |M(Ri Rj) A| = |Ri Rj|. 1. By Definition 3, a set W is a wrapper iff MSSF W SSF , and the remainder of W is R = W non MSSF . Wi and Wj are wrappers, thus MSSF Wi, Wj SSF , and hence MSSF Wi Wj SSF . Furthermore, Ri = Wi non MSSF and Rj = Wj non MSSF , hence Ri Rj = Wi Wj non MSSF . 2. By Propositions 2, 4, 8 and 11, M(Wi Wj) A = MWi A MWj A = {π | AF(π) Wi} {π | AF(π) Wj} = {π | AF(π) Wi Wj}. The other direction holds since AF is a one-to-one mapping (bijection). 3. By Propositions 3, 5, 9 and 12, M(Ri Rj) A = MRi A MRj A = {π | AF(π) Ri} {π | AF(π) Rj} = {π | AF(π) Ri Rj}. The other direction holds since AF is a one-to-one mapping (bijection). Note that all the formulas W2, W3 and W4 use as a subformula W1. Similarly, all the formulas R2, R3 and R4 use as a subformula R1. Thus, if we combine two wrappers, we duplicate some clauses in the formulas. In our implementation, we first remove all duplicated clauses from a formula before we pass it to a model counting tool. This simplification is sound as it does not reduce the number of models of the formula. On Choice of Projected Model Counting We remark on the choice of reduction of MSS counting to projected model counting. One might wonder whether we could have reduced to the classical problem of model counting. In this context, note that the classical model counting is #P-complete and checking whether an assignment satisfies a CNF formula is in P. In contrast, checking whether a given set of clauses is an MSS is in DP, and therefore, it is expected to rely on a problem that is perhaps harder than classical model counting from the complexity perspective. Therefore, projected model counting, which is in #NP-hard, is a good choice given its hardness and the recent development of efficient techniques. Experimental Evaluation We have implemented our approach for solving the #MSS problem in a python-based tool. To count the number of projected models of the wrappers, we use the model counter GANAK (Sharma et al. 2019). Furthermore, we employ the Max SAT solver UWr Max Sat (Piotr ow 2019) as a backend while computing the under-approximation I of the intersection of MSSes in the case of the wrapper W2. Our tool is publicly available at: https://github.com/jar-ben/MSSCounting In this section, we experimentally compare the four wrappers, W1, . . . , W4, and their combinations for the task of determining the MSS count of a given formula. Note that the wrapper W1 is by the definition subsumed by the remaining three wrappers. Therefore, there are only 8 possible combined wrappers (according to Proposition 13) that make sense: W1 = W1, W2 = W2, W3 = W3, W4 = W4, W23 = W2 W3, W24 = W2 W4, W34 = W3 W4, W234 = W2 W3 W4. At first glance, the wrapper W234 that combines all the base wrappers should be the most suitable one since it provides the most accurate description of MSSes. However, the Boolean formula that describes this wrapper is also the largest one in the number of clauses, and thus it might be hard to deal with for the model counter. Therefore, it makes sense to evaluate all the 8 combinations. Moreover, we compare our wrapper-based approach for counting MSSes with the contemporary MSS counting approach: complete MSS enumeration via an MSS enumeration tool. In particular, we evaluate two contemporary MSS enumeration tools: FLINT (Narodytska et al. 2018)1, and RIME (Bend ık and ˇCern a 2020)2. Thus, in total, we compare ten tools (RIME, FLINT, and our approach using one of the eight wrappers). We use three comparison criteria: 1) the number of benchmarks for which the tools provide the MSS count, 2) the time to provide the MSS count, and 3) the scalability of the tools w.r.t. the MSS count. We used a collection of 1200 Boolean CNF formulas that were recently used in prior MUS literature (Liu and Luo 1The implementation of FLINT was kindly provided to us by its author, Nina Narodytska. 2https://github.com/jar-ben/rime 100 200 300 400 500 600 elapsed time in seconds solved benchmarks Figure 2: The number of finished benchmarks in time. 0 100 200 300 400 500 600 individual benchmarks Figure 3: Scalability w.r.t. the MSS count. 2018; Luo and Liu 2019)3. The benchmarks contain from 100 to 1000 clauses, use from 50 to 996 variables, and have from 2 to at least 4.74 1012 MSSes (the highest MSS count revealed in our evaluation). All experiments were run using a time limit of 3600 seconds (1 hour) and computed on an AMD EPYC 7371 16Core Processor, 1 TB memory machine running Debian Linux 4.19.67-2. Number of Completed Benchmarks We now examine the number of benchmarks for which the evaluated tools determined the MSS count; we simply say that the tools solved the benchmarks. Only 515 of the 1200 benchmarks were solved by at least one of the tools. Furthermore, 353 benchmarks were solved either by FLINT or by RIME, and 510 benchmarks were solved using one of our wrapper-based tools. The cactus plot in Figure 2 shows for each tool the number of solved benchmarks and the time to solve the benchmarks. In particular, a point with coordinates [x, y] means that there are x benchmarks for which the corresponding tool provided the MSS count within the first y seconds of the computation. There are only 327 and 347 benchmarks where FLINT and RIME provided the MSS count, respectively. As for our approach, the best result was achieved by 3https://github.com/luojie-sklsde/MUS Random Benchmarks FLINT RIME W234 W24 W23 FLINT 6;26 2;181 2;181 2;146 RIME 26;6 5;164 5;164 5;129 W234 181;2 164;5 4;4 35;0 W24 181;2 164;5 4;4 38;3 W23 146;2 129;5 0;35 3;38 Table 1: Number of benchmarks where a tool solved more;fewer benchmarks than the other tools. the wrappers W234 and W24 which both solved 506 benchmarks, i.e., there is an incredible improvement of 46 percent over the best MSS enumerator RIME. W23 solved 471 benchmarks, which is also a solid result. On the other hand, W2, W4, and W34 solved only 209, 195, and 197 benchmarks, respectively. W1 and W3 did not solve even a single benchmark. We also show in the plot the result obtained by the virtual best system (VBS), i.e., for each benchmark we consider the fastest of the evaluated tools. Note that W234 and W24 performed almost as good as the VBS. There are 4 benchmarks that were solved by W234, but not by W24, and vice versa. Table 1 pair-wise compares FLINT, RIME, and the three best wrappers, W234, W24, and W23, w.r.t. this criterion. Each cell contains two numbers, k; l, expressing that there are k benchmarks that were solved by the tool labeling the row but not by the tool labeling the column, and vice versa for l. There are only 2 and 5 benchmarks that were solved by FLINT and RIME, respectively, and that were not solved by any of our wrappers. Scalability W.R.T. the MSS Count An MSS enumerator (e.g., FLINT or RIME) has to explicitly enumerate all MSSes to obtain the MSS count. Consequently, if the MSS count is large, the complete enumeration naturally becomes practically intractable (w.r.t. a reasonable time limit). On the other hand, our approach reduces the MSS counting to model counting, and it is often the case that a model counter needs to explicitly identify only a fraction of the models. Consequently, our approach should hypothetically scale much better w.r.t. the MSS count. To prove our hypothesis, we compare the best MSS enumeration tool, RIME, with our best wrapper, W234. The plot in Figure 3 shows on the x-axis the benchmarks that were completed by at least one of the two tools and on the yaxis the MSS count of the benchmarks. The benchmarks are sorted by the MSS count. In the case of RIME, the points in the plot show the number of enumerated MSSes within the given time limit, i.e., it is either the exact MSS count or its under-approximation. RIME was able to solve only benchmarks with at most 106 MSSes. On the other hand, W234 solved even benchmarks that contain 1012 MSSes, i.e., it scales much better. Note that we show in the plot also the 5 benchmarks that were solved by RIME but not by W234; these are illustrated as the 5 points on the x-axis. Conclusion Motivated by the progress in model counting, we initiate the study of counting the number of MSSes of a given formula. Our novel algorithmic framework relies on the notions of wrappers and their corresponding remainders. We show that wrappers and remainders compose, and the computation of the sizes of wrappers and remainders reduces to the projected model counting. The availability of an efficient projected model counter, GANAK, allowed our MSS counting approach to scale, in terms of the MSS count, significantly better than alternative approaches based on the MSS enumeration. As for the future work, we would like to address also a better scaling of our approach w.r.t. the number of clauses in the input instance. Whereas we are currently able to handle instances with hundreds of clauses, instances with high thousands or even millions of clauses are still out of our reach. In this context, a promising challenge would be to handle the widely used dataset of 300 CNF formulas from the MUS track of the SAT Competition 2011. A vast majority of benchmarks from this set is not tractable for contemporary MSS enumeration tools due to a large number of MSSes, and it is also not tractable for our approach owing to a large number of clauses in the benchmarks, which in turn leads to an increase in the number of variables for projected model counting queries. An interesting direction to address this scalability challenge is to investigate whether a component caching-based scheme operating natively over the space of MSSes, i.e., avoiding the reduction to model counting, can lead to a better runtime efficiency. Another line of future work is to evaluate other contemporary projected model counting tools such as nest HDB (Hecher, Thier, and Woltran 2020) or proj MC (Lagniez and Marquis 2019), and to employ preprocessing techniques such as (Manthey 2012) or (Lagniez, Lonca, and Marquis 2020). Finally, we plan to examine an extension of our MSS counting approach to other constraint domains where MSSes find an application, e.g., F can be a set of LTL (Barnat et al. 2016; Bend ık 2017) or SMT (Guthmann, Strichman, and Trostanetski 2016) formulas. Acknowledgements This work was supported in part by the National Research Foundation Singapore under its NRF Fellowship Programme [NRF-NRFFAI1-2019-0004] and the AI Singapore Programme [AISG-RP-2018-005], NUS ODPRT Grant [R252-000-685-13]. Aziz, R. A.; Chu, G.; Muise, C.; and Stuckey, P. 2015. exists SAT: Projected Model Counting. In International Conference on Theory and Applications of Satisfiability Testing, 121 137. Springer. Bacchus, F.; and Katsirelos, G. 2015. Using Minimal Correction Sets to More Efficiently Compute Minimal Unsatisfiable Sets. In CAV (2), volume 9207 of LNCS, 70 86. Springer. Bacchus, F.; and Katsirelos, G. 2016. Finding a Collection of MUSes Incrementally. In CPAIOR, volume 9676 of LNCS, 35 44. Springer. Bailey, J.; and Stuckey, P. J. 2005. Discovery of minimal unsatisfiable subsets of constraints using hitting set dualization. In PADL, 174 186. Springer. Baluta, T.; Shen, S.; Shinde, S.; Meel, K. S.; and Saxena, P. 2019. Quantitative verification of neural networks and its security applications. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, 1249 1264. Barnat, J.; Bauch, P.; Beneˇs, N.; Brim, L.; Beran, J.; and Kratochv ıla, T. 2016. Analysing sanity of requirements for avionics systems. FAo C 1 19. Bayardo Jr, R. J.; and Pehoushek, J. D. 2000. Counting models using connected components. In AAAI/IAAI, 157 162. Belov, A.; Lynce, I.; and Marques-Silva, J. 2012. Towards efficient MUS extraction. AI Commun. 25(2): 97 116. Bend ık, J. 2017. Consistency checking in requirements analysis. In ISSTA, 408 411. ACM. Bend ık, J.; Beneˇs, N.; ˇCern a, I.; and Barnat, J. 2016. Tunable Online MUS/MSS Enumeration. In FSTTCS, volume 65 of LIPIcs, 50:1 50:13. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik. Bend ık, J.; and ˇCern a, I. 2020a. MUST: Minimal Unsatisfiable Subsets Enumeration Tool. In TACAS (1), volume 12078 of LNCS, 135 152. Springer. Bend ık, J.; and ˇCern a, I. 2020b. Replication-Guided Enumeration of Minimal Unsatisfiable Subsets. In CP, volume 12333 of LNCS, 37 54. Springer. Bend ık, J.; and ˇCern a, I. 2020. Rotation Based MSS/MCS Enumeration. In LPAR, volume 73 of EPi C Series in Computing, 120 137. Easy Chair. Bend ık, J.; ˇCern a, I.; and Beneˇs, N. 2018. Recursive Online Enumeration of All Minimal Unsatisfiable Subsets. In ATVA, volume 11138 of LNCS, 143 159. Springer. Bend ık, J.; and Meel, K. S. 2020. Approximate Counting of Minimal Unsatisfiable Subsets. In CAV (1), volume 12224 of LNCS, 439 462. Springer. Besnard, P.; Gr egoire, E.; and Lagniez, J.-M. J. 2015. On computing maximal subsets of clauses that must be satisfiable with possibly mutually-contradictory assumptive contexts. In Twenty-Ninth AAAI Conference on Artificial Intelligence. Biondi, F.; Enescu, M. A.; Heuser, A.; Legay, A.; Meel, K. S.; and Quilbeuf, J. 2018. Scalable approximation of quantitative information flow in programs. In International Conference on Verification, Model Checking, and Abstract Interpretation, 71 93. Springer. Birnbaum, E.; and Lozinskii, E. L. 1999. The good old Davis-Putnam procedure helps counting models. Journal of Artificial Intelligence Research 10: 457 477. Chakraborty, S.; Fremont, D. J.; Meel, K. S.; Seshia, S. A.; and Vardi, M. Y. 2014. Distribution-Aware Sampling and Weighted Model Counting for SAT. In Proc. of AAAI, 1722 1730. Chakraborty, S.; Meel, K. S.; and Vardi, M. Y. 2016. Algorithmic Improvements in Approximate Counting for Probabilistic Inference: From Linear to Logarithmic SAT Calls. In Proc. of IJCAI. Darwiche, A. 2004. New Advances in Compiling CNF into Decomposable Negation Normal Form. In ECAI, 328 332. IOS Press. Darwiche, A. 2011. SDD: A New Canonical Representation of Propositional Knowledge Bases. In IJCAI, 819 826. IJCAI/AAAI. de Kleer, J.; and Williams, B. C. 1987. Diagnosing Multiple Faults. Artif. Intell. 32(1): 97 130. Durand, A.; Hermann, M.; and Kolaitis, P. G. 2005. Subtractive reductions and complete problems for counting complexity classes. Theoretical Computer Science 340(3): 496 513. Guthmann, O.; Strichman, O.; and Trostanetski, A. 2016. Minimal unsatisfiable core extraction for SMT. In FMCAD, 57 64. IEEE. Hecher, M.; Thier, P.; and Woltran, S. 2020. Taming High Treewidth with Abstraction, Nested Dynamic Programming, and Database Technology. In SAT, volume 12178 of LNCS, 343 360. Springer. Hunter, A.; and Konieczny, S. 2008. Measuring Inconsistency through Minimal Inconsistent Sets. In KR, 358 366. AAAI Press. Kleine B uning, H.; and Kullmann, O. 2009. Minimal Unsatisfiability and Autarkies. In Handbook of Satisfiability, volume 185 of FAIA, 339 401. IOS Press. Kullmann, O. 2000a. An Application of Matroid Theory to the SAT Problem. In Computational Complexity Conference, 116. IEEE Computer Society. Kullmann, O. 2000b. Investigations on autark assignments. Discrete Applied Mathematics 107(1-3): 99 137. Lagniez, J.; Lonca, E.; and Marquis, P. 2020. Definability for model counting. Artif. Intell. 281: 103229. Lagniez, J.; and Marquis, P. 2017. An Improved Decision DNNF Compiler. In IJCAI, 667 673. ijcai.org. Lagniez, J.-M.; and Marquis, P. 2019. A recursive algorithm for projected model counting. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, 1536 1543. Liffiton, M. H.; Previti, A.; Malik, A.; and Marques-Silva, J. 2016. Fast, flexible MUS enumeration. Constraints 21(2): 223 250. Liffiton, M. H.; and Sakallah, K. A. 2008. Algorithms for computing minimal unsatisfiable subsets of constraints. JAR 40(1): 1 33. Liu, S.; and Luo, J. 2018. FMUS2: An Efficient Algorithm to Compute Minimal Unsatisfiable Subsets. In AISC, volume 11110 of LNCS, 104 118. Springer. Luo, J.; and Liu, S. 2019. Accelerating MUS enumeration by inconsistency graph partitioning. Science China Information Sciences 62(11): 212104. Manthey, N. 2012. Coprocessor 2.0 - A Flexible CNF Simplifier - (Tool Presentation). In SAT, volume 7317 of LNCS, 436 441. Springer. Marques-Silva, J.; Heras, F.; Janota, M.; Previti, A.; and Belov, A. 2013. On Computing Minimal Correction Subsets. In IJCAI, 615 622. IJCAI/AAAI. Marques-Silva, J.; Ignatiev, A.; Morgado, A.; Manquinho, V. M.; and Lynce, I. 2014. Efficient Autarkies. In ECAI, volume 263 of FAIA, 603 608. IOS Press. Menc ıa, C.; Kullmann, O.; Ignatiev, A.; and Marques-Silva, J. 2019. On Computing the Union of MUSes. In SAT, volume 11628 of LNCS, 211 221. Springer. Meseguer, P.; Bouhmala, N.; Bouzoubaa, T.; Irgens, M.; and S anchez-Fibla, M. 2003. Current Approaches for Solving Over-Constrained Problems. Constraints An Int. J. 8(1): 9 39. M ohle, S.; and Biere, A. 2018. Dualizing projected model counting. In 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), 702 709. IEEE. Monien, B.; and Speckenmeyer, E. 1985. Solving satisfiability in less than 2n steps. Discrete Applied Mathematics 10(3): 287 295. Muise, C.; Mc Ilraith, S. A.; Beck, J. C.; and Hsu, E. I. 2012. D sharp: fast d-DNNF compilation with sharp SAT. In Canadian Conference on Artificial Intelligence, 356 361. Springer. Narodytska, N.; Bjørner, N.; Marinescu, M.; and Sagiv, M. 2018. Core-Guided Minimal Correction Set and Core Enumeration. In IJCAI, 1353 1361. ijcai.org. Piotr ow, M. 2019. UWr Max Sat-a new Mini Sat+-based Solver in Max SAT Evaluation 2019. Max SAT Evaluation 2019 11. Previti, A.; Menc ıa, C.; J arvisalo, M.; and Marques-Silva, J. 2018. Premise Set Caching for Enumerating Minimal Correction Subsets. In AAAI, 6633 6640. AAAI Press. Reiter, R. 1987. A Theory of Diagnosis from First Principles. Artif. Intell. 32(1): 57 95. Sang, T.; Bacchus, F.; Beame, P.; Kautz, H. A.; and Pitassi, T. 2004. Combining Component Caching and Clause Learning for Effective Model Counting. SAT 4: 7th. Sang, T.; Beame, P.; and Kautz, H. A. 2005. Performing Bayesian inference by weighted model counting. In AAAI, volume 5, 475 481. Sashittal, P.; and El-Kebir, M. 2020. Sampling and summarizing transmission trees with multi-strain infections. Bioinformatics 36(Supplement 1): i362 i370. Sharma, S.; Roy, S.; Soos, M.; and Meel, K. S. 2019. GANAK: A Scalable Probabilistic Exact Model Counter. In IJCAI, 1169 1176. ijcai.org. Sperner, E. 1928. Ein satz uber untermengen einer endlichen menge. Mathematische Zeitschrift 27(1): 544 548. Stern, R. T.; Kalech, M.; Feldman, A.; and Provan, G. M. 2012. Exploring the Duality in Conflict-Directed Model Based Diagnosis. In AAAI. AAAI Press. Thimm, M. 2018. On the evaluation of inconsistency measures. Measuring Inconsistency in Information 73. Thurley, M. 2006. sharp SAT counting models with advanced component caching and implicit BCP. In International Conference on Theory and Applications of Satisfiability Testing, 424 429. Springer. Valiant, L. G. 1979. The Complexity of Enumeration and Reliability Problems. SIAM J. Comput. 8(3): 410 421.