SharpSSAT: A Witness-Generating Stochastic Boolean Satisfiability Solver

Yu-Wei Fan1, Jie-Hong R. Jiang1,2
1Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan
2Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
{r11943096, jhjiang}@ntu.edu.tw

Stochastic Boolean satisfiability (SSAT) is a formalism allowing decision-making for optimization under quantitative constraints. Although SSAT solvers are under active development, existing solvers do not provide Skolem-function witnesses, which are crucial for practical applications. In this work, we develop a new witness-generating SSAT solver, SharpSSAT, which integrates techniques including component caching, clause learning, and pure-literal detection. It can generate a set of Skolem functions witnessing the attained satisfying probability of a given SSAT formula. We also equip the solver ClauSSat with witness-generation capability for comparison. Experimental results show that SharpSSAT outperforms current state-of-the-art solvers and can effectively generate compact Skolem-function witnesses. The new witness-generating solver may broaden the applicability of SSAT to practical applications.

1 Introduction

Efficient solving techniques for Boolean satisfiability (SAT) problems have led to tremendous success in formal verification, synthesis, electronic design automation, and other domains that rely on Boolean reasoning. The growing complexity of verification and the variety of applications have motivated researchers to investigate logical formalisms and problems beyond SAT. Recently, stochastic Boolean satisfiability (SSAT) has drawn attention from the fields of probabilistic planning (Salmon and Poupart 2019), verification of probabilistic design (Lee and Jiang 2018), and fairness analysis of machine-learning models (Ghosh, Basu, and Meel 2021). It provides a convenient language to compactly encode decision and optimization problems under uncertainty. As more applications of SSAT emerge, developing effective solving techniques becomes crucial.

Solving a given SSAT formula involves two important tasks: One is to determine the maximum satisfying probability of the formula. The other is to generate a Skolem-function witness, i.e., Skolem functions for the existential variables, that witnesses the maximum satisfying probability. To date, existing SSAT solvers, e.g., DC-SSAT (Majercik and Boots 2005), Prime (Salmon and Poupart 2019), ClauSSat (Chen, Huang, and Jiang 2021), and ElimSSAT (Wang et al. 2022), mostly focus on the first task. Specifically, DC-SSAT is a divide-and-conquer SSAT solver that solves an SSAT formula by decomposing it into smaller sub-problems. Prime exploits model counting techniques, including component analysis and advanced caching schemes, for SSAT solving. ClauSSat tackles SSAT solving based on the clause selection framework (Janota and Marques-Silva 2015). ElimSSAT proposes a quantifier-elimination approach to SSAT solving. Regarding the second task, the witness can be crucial for practical applications. For example, the witness may correspond to an access control policy between organizations (Freudenthal and Karamcheti 2003), or a feasible plan in planning (Littman, Majercik, and Pitassi 2001).
Unfortunately, none of the existing SSAT solvers can provide a Skolem-function witness. We note that although DC-SSAT maintains solution tree structures during solving, it still lacks a complete procedure to generate a full Skolem-function witness. In this paper, a new SSAT solver, named SharpSSAT, is developed based on the model counter sharpSAT (Thurley 2006). Different from Prime, in addition to component analysis and caching, SharpSSAT further combines clause learning and pure-literal detection. To bridge the gap between SSAT solving and witness generation, we propose two different witness generation algorithms under SharpSSAT and ClauSSat. To the best of our knowledge, it is the first time that witness generation is supported by SSAT solvers. Experiments on various benchmarks show the performance superiority of SharpSSAT over state-of-the-art solvers, including DC-SSAT, Prime, ClauSSat, and ElimSSAT. The witness generation overhead and witness quality are compared between SharpSSAT and ClauSSat. The results suggest that SharpSSAT can generate more compact strategies with less run-time but more memory overhead than ClauSSat.

The rest of this paper is organized as follows. The required preliminaries are provided in Section 2. The SSAT solving and witness generation algorithms of SharpSSAT are detailed in Section 3 and Section 4, respectively. Section 5 elaborates the witness generation algorithm for ClauSSat. Section 6 evaluates SharpSSAT with experimental results. Section 7 concludes this work and outlines some future work.

2 Preliminaries

We use the symbols ⊤ and ⊥ to represent the Boolean values TRUE and FALSE, respectively. Symbols ¬, ∨, ∧, and ↔ denote the Boolean connectives negation, disjunction, conjunction, and equivalence, respectively. For brevity, in a formula, a conjunction is sometimes omitted, and a negation is sometimes denoted with an overline. A literal l of a variable x is either a positive-phase literal x or a negative-phase literal ¬x. We denote the variable corresponding to l by var(l). A clause is a disjunction of literals. A cube is a conjunction of literals. A Boolean formula is in conjunctive normal form (CNF) if it is a conjunction of a set of clauses. For a CNF formula ϕ, we use Vars(ϕ) and Cls(ϕ) to denote the sets of variables and clauses appearing in ϕ, respectively. A Boolean function f over variables X is a mapping f : 𝔹^|X| → 𝔹. An assignment τ is a mapping τ : X → 𝔹. The induced formula of f under an assignment τ, denoted f|τ, is obtained by replacing each variable with its value mapped by τ. The onset of a Boolean function f, denoted f⁺, is the set of assignments {τ | f|τ = ⊤}. The offset of f, denoted f⁻, is the set of assignments {τ | f|τ = ⊥}.

Model Counting

Given a CNF formula ϕ, the unweighted model counting problem is to find the number of satisfying assignments of ϕ, denoted #(ϕ). The weighted model counting problem extends unweighted model counting by assigning each literal a weight. The weight of an assignment is the product of the weights of the literals appearing in the assignment. Weighted model counting asks for the sum of the weights of the satisfying assignments.
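As an illustration of this definition, the following brute-force sketch enumerates all assignments of a CNF given over signed-integer (DIMACS-style) literals and sums the weights of the satisfying ones; the function name, input encoding, and example weights are assumptions of this sketch, not part of any solver discussed in this paper.

    # A brute-force sketch of weighted model counting, assuming a CNF given as a
    # list of clauses over signed integer literals and a weight for every literal.
    # Exponential in the number of variables; for illustration only.
    from itertools import product

    def weighted_model_count(clauses, variables, weight):
        """weight maps each signed literal (e.g., 1 and -1) to its weight."""
        total = 0.0
        for values in product([False, True], repeat=len(variables)):
            assignment = dict(zip(variables, values))
            lit_true = lambda lit: assignment[abs(lit)] == (lit > 0)
            if all(any(lit_true(l) for l in clause) for clause in clauses):
                # The weight of a satisfying assignment is the product of the
                # weights of the literals it sets to true.
                w = 1.0
                for v in variables:
                    w *= weight[v] if assignment[v] else weight[-v]
                total += w
        return total

    # Example: phi = (x1 v x2) with weight(x1) = 0.4, weight(-x1) = 0.6,
    # weight(x2) = weight(-x2) = 0.5; the weighted model count is 0.7.
    print(weighted_model_count([[1, 2]], [1, 2], {1: 0.4, -1: 0.6, 2: 0.5, -2: 0.5}))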
Stochastic Boolean Satisfiability

An SSAT formula Φ over variables X = X1 ∪ X2 ∪ · · · ∪ Xn, with Xi ≠ ∅ and Xi ∩ Xj = ∅ for i ≠ j and i, j ∈ {1, . . . , n}, can be expressed in the prenex conjunctive normal form (PCNF)

    Q1X1, . . . , QnXn. φ,   (1)

where Q1X1, . . . , QnXn is called the prefix, each Qi ∈ {∃, R} is either an existential quantifier ∃ or a randomized quantifier R, and the CNF formula φ is called the matrix. The quantification level of a variable x, denoted by qlev(x), is i if x ∈ Xi. We refer to the set X∃ = {x ∈ Xi | Qi = ∃} as the existential variables and the set XR = {x ∈ Xi | Qi = R} as the randomized variables. When a variable x is quantified by a randomized quantifier R^px, it is assigned ⊤ (resp. ⊥) with probability px (resp. 1 − px). Given a variable x at the outermost quantification level, the satisfying probability of the SSAT formula Φ is computed by the following rules:

1. Pr[⊤] = 1,
2. Pr[⊥] = 0,
3. Pr[Φ] = max{Pr[Φ|¬x], Pr[Φ|x]} if x ∈ X∃,
4. Pr[Φ] = (1 − px) Pr[Φ|¬x] + px Pr[Φ|x] if x ∈ XR.

In solving an SSAT formula, we aim to compute its maximum satisfying probability. On the other hand, for witness generation, the objective is to derive Skolem functions for the existential variables that attain the claimed satisfying probability. Formally, given an SSAT formula Φ in the form of Eq. (1), we say the set of Skolem functions F = {fx | x ∈ X∃}, with fx depending on the set of randomized variables Yx = {y | y ∈ XR, qlev(y) < qlev(x)}, is a witness for probability p if the SSAT formula Φ′ obtained from Φ by substituting each existential variable x in the matrix of Φ with fx has satisfying probability Pr[Φ′] = p.

Example 1. Consider the SSAT formula

    R^0.4 x1, ∃y1, R^0.3 x2, ∃y2. (¬y1 ∨ x1)(y1 ∨ ¬x1)(¬y2 ∨ x1)(¬y2 ∨ x2)(y2 ∨ ¬x1 ∨ ¬x2),

where the first two clauses encode y1 ↔ x1 and the last three encode y2 ↔ (x1 ∧ x2). The Skolem functions F = {fy1(x1) = ⊤, fy2(x1, x2) = x1 ∧ x2} are a witness for probability 0.4 since, by substituting y1 with ⊤ and y2 with x1 ∧ x2, the matrix becomes (x1), and Pr[R^0.4 x1. (x1)] = 0.4. We note that, in fact, the maximum satisfying probability of this formula is 1.

For an SSAT formula Φ, a witness is associated with a probability p. Because p is not necessarily equal to the maximum satisfying probability Pr[Φ], a witness may not be strong enough to certify the maximum satisfying probability. However, it certifies a lower bound of Pr[Φ]. In this work, when a solver concludes a probability p for an SSAT formula, we are concerned with deriving a witness for the claimed probability p.
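To make rules 1–4 concrete, the following naive recursive sketch evaluates the satisfying probability of a small SSAT formula by expanding the prefix one variable at a time; it is exponential and purely illustrative, and the encoding of the prefix as (variable, quantifier) pairs and of literals as signed integers is an assumption of this sketch, not SharpSSAT's representation.

    # A naive recursive sketch of the SSAT satisfying-probability semantics
    # (rules 1-4 above). The prefix is a list of (variable, quantifier) pairs,
    # outermost first, with quantifier 'E' (existential) or a probability p
    # (randomized); clauses use signed integers as literals.
    def ssat_prob(prefix, clauses):
        if [] in clauses:            # an empty clause: the matrix is falsified
            return 0.0
        if not clauses:              # all clauses satisfied
            return 1.0
        (x, q), rest = prefix[0], prefix[1:]
        def cofactor(value):
            out = []
            for c in clauses:
                if (x if value else -x) in c:        # clause satisfied, drop it
                    continue
                out.append([l for l in c if abs(l) != x])
            return out
        p_neg = ssat_prob(rest, cofactor(False))
        p_pos = ssat_prob(rest, cofactor(True))
        if q == 'E':                                 # existential: take the better branch
            return max(p_neg, p_pos)
        return (1 - q) * p_neg + q * p_pos           # randomized: weighted sum

    # Example 1: R^0.4 x1, E y1, R^0.3 x2, E y2 over the matrix shown above,
    # with x1 = 1, x2 = 2, y1 = 3, y2 = 4.
    prefix = [(1, 0.4), (3, 'E'), (2, 0.3), (4, 'E')]
    matrix = [[-3, 1], [3, -1], [-4, 1], [-4, 2], [4, -1, -2]]
    print(ssat_prob(prefix, matrix))   # 1.0, the maximum satisfying probability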
3 SSAT Solving via Component Analysis and Clause Learning

We present the SSAT solving algorithm incorporating advanced model counting techniques in Algorithm 1. The algorithm differs from Prime (Salmon and Poupart 2019) mainly in the added pure-literal elimination and clause-learning steps. Algorithm 1 takes an SSAT formula Φ as input and returns the satisfying probability pret. It starts with Boolean constraint propagation on the formula in Line 1. After that, it decomposes the current formula into several sub-components in Line 2. For each sub-component Φi, it first checks whether the component is in the cache. If the current component is in the cache, it updates pret with the cached probability and continues to the next component. Otherwise, it chooses a branching literal x that appears in Φi and performs constant propagation to simplify the positive cofactor Φi|x and the negative cofactor Φi|¬x. In Lines 11 and 12, it recursively calls the procedure to obtain the satisfying probabilities p0 and p1 of the two branches. According to the quantifier of the branching literal, it concludes that the satisfying probability of Φi is either the maximum or the weighted sum of p0 and p1. Upon solving the probability of Φi, it updates the return probability pret accordingly. When all the sub-components are processed, it adds the cache entry (Φ, pret) to the cache to avoid re-computation.

Algorithm 1: solveSSAT
Input: an SSAT formula Φ
Output: satisfying probability pret
1:  bcp(Φ)
2:  componentAnalysis(Φ)
3:  pureElimination(Φ)
4:  pret ← 1
5:  for Φi ∈ Components(Φ) do
6:      p ← 0
7:      if inCache(Φi) then
8:          pret ← pret × getProbFromCache(Φi)
9:          continue
10:     x ← chooseBranchLit(Φi)
11:     p0 ← solveSSAT(Φi|x)
12:     p1 ← solveSSAT(Φi|¬x)
13:     if x ∈ X∃ then
14:         p ← max(p0, p1)
15:     else
16:         p ← Pr[x] · p0 + Pr[¬x] · p1
17:     if p = 0 then
18:         removePollutedCache(Φi)
19:         addLearnedClause()
20:         pret ← 0
21:         break
22:     pret ← pret × p
23: addToCache(Φ, pret)
24: return pret

Component Analysis and Caching

Component analysis is a technique widely used in search-based model counters (Thurley 2006; Sang et al. 2004). It views a formula Φ as a connectivity graph and applies breadth-first search to identify connected components, say Φ1, Φ2, . . . , Φn, such that Vars(Φi) ∩ Vars(Φj) = ∅ for i ≠ j. The model count #(Φ) of Φ then equals the product of the model counts of its components, i.e.,

    #(Φ) = ∏_{i=1}^{n} #(Φi).   (2)

Salmon and Poupart observe that this property also holds for SSAT:

Theorem 1. Given an SSAT formula Φ in the form of Eq. (1), if φ = φ1 ∧ φ2 ∧ · · · ∧ φk such that Vars(φi) ∩ Vars(φj) = ∅ for i ≠ j, and Φi = Q1X1, . . . , QnXn. φi, then Pr[Φ] = ∏_{i=1}^{k} Pr[Φi].

Hence, by Theorem 1, we can solve the original SSAT formula Φ by decomposing it into smaller sub-components Φ1, Φ2, . . . , Φk and solving each of them independently.

Caching is another technique that records previously solved sub-problems to avoid solving the same sub-problem repeatedly (Sang et al. 2004). It has been shown that different encodings of a component significantly affect the cache miss rate and the memory usage (Thurley 2006). In our implementation, we use the hybrid-coding scheme proposed in (Thurley 2006).

Clause Learning

Clause learning records conflict information as a learned clause to effectively prune the UNSAT search space (Marques-Silva and Sakallah 1999). Adding learned clauses to the original formula strengthens the power of Boolean constraint propagation. In SharpSSAT, a learned clause is added to the matrix of the SSAT formula whenever a conflict occurs. However, as pointed out in (Sang et al. 2004), some components may be cached with a wrong model count when an UNSAT component is encountered in model counting. The same problem occurs in SSAT solving. Therefore, when a sub-component Φi is UNSAT, all the cached sub-components derived from Φ1 to Φi−1 need to be removed from the cache. In Algorithm 1, the function removePollutedCache does this job in Line 18.

Pure Literal Detection

A literal l is called pure in the formula Φ if l appears in Φ but ¬l does not. If l is pure and var(l) ∈ X∃, we can immediately assign l to ⊤ and eliminate all the clauses containing l, called its occurrence clauses. In QBF solving, pure-literal detection is commonly implemented with a clause-watching data structure (Gent et al. 2003), as it does not need to be updated during backtracking. In SharpSSAT, we observe that pure literals can be detected during component analysis, since all literals appearing in the formula are visited once, without the need for a clause-watching data structure. Although more pure literals may appear and could be detected by iteratively applying component analysis after eliminating the clauses involving a detected pure literal, the extra component analysis steps can be too costly. Therefore, we implement the detection and elimination in one pass. Empirical results show that this implementation is still effective and introduces little run-time overhead.
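The following one-pass sketch illustrates the idea of pure-literal detection described above: while scanning a component's literals, it records the phases seen for each variable and reports the existential variables that occur in only one phase. The signed-integer clause encoding and the function name are assumptions of this sketch.

    # A one-pass sketch of pure-literal detection over a component's clauses,
    # assuming literals are signed integers and exist_vars is the set of
    # existential variables.
    def pure_existential_literals(clauses, exist_vars):
        seen = {}                                     # variable -> set of phases observed
        for clause in clauses:
            for lit in clause:
                seen.setdefault(abs(lit), set()).add(lit > 0)
        return [v if phases == {True} else -v
                for v, phases in seen.items()
                if v in exist_vars and len(phases) == 1]

    # In Example 2 below, Φ1|¬y1¬y2 = (y3 ∨ x4)(y3 ∨ ¬x4); with y3 existential
    # (encoded as 7) and x4 randomized (encoded as 4), literal y3 is reported as pure.
    print(pure_existential_literals([[7, 4], [7, -4]], {5, 6, 7}))   # [7]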
Early Return at Existential Quantifier

With the component analysis scheme, the update of the satisfying probability is monotonically decreasing, since pret is multiplied by a probability, which is at most 1, in every sub-component iteration (Line 22 in Algorithm 1). Since at an existential quantifier the returned probability is the maximum of the two branches, given the probability of the first branch Pr[Φ|¬v], whenever an update makes the probability of the current branch pret ≤ Pr[Φ|¬v], SharpSSAT immediately concludes Pr[Φ] = Pr[Φ|¬v] without solving the remaining sub-components.

Example 2. Consider the SSAT formula Φ:

    R^0.4 x1, R^0.5 x2, R^0.5 x3, ∃y1, ∃y2, ∃y3, R^0.6 x4. (x1 ∨ ¬y1)(¬y1 ∨ y2 ∨ y3)(y1 ∨ ¬y2 ∨ x4)(¬y2 ∨ ¬y3)(y2 ∨ y3 ∨ x4)(y3 ∨ ¬x4)(x2 ∨ x3).

Algorithm 1 starts from the x1 branch. Φ|x1 is then decomposed into Φ1 = (¬y1 ∨ y2 ∨ y3)(y1 ∨ ¬y2 ∨ x4)(¬y2 ∨ ¬y3)(y2 ∨ y3 ∨ x4)(y3 ∨ ¬x4) and Φ2 = (x2 ∨ x3). Now Φ1 is solved first. At the ¬y1 y2 branch, since Φ1|¬y1,y2 is UNSAT, a learned clause (y1 ∨ ¬y2) is added to the matrix. At the ¬y1 ¬y2 branch, where Φ1|¬y1,¬y2 = (y3 ∨ x4)(y3 ∨ ¬x4), literal y3 is detected as pure, and Pr[Φ1|¬y1,¬y2] is concluded to be 1. Since y2 is existential, Pr[Φ1|¬y1] = max(0, 1) = 1. As Pr[Φ1|¬y1] = 1, Pr[Φ1] is directly concluded to be 1 without solving Φ1|y1. Similarly, Pr[Φ2] is evaluated to 0.75, and Pr[Φ|x1] = 1 × 0.75 = 0.75. Algorithm 1 then explores the ¬x1 branch, where Φ|¬x1 = (y3 ∨ x4)(y3 ∨ ¬x4)(x2 ∨ x3). After component analysis, two cache hits occur for (y3 ∨ x4)(y3 ∨ ¬x4) and (x2 ∨ x3), and Pr[Φ|¬x1] equals 0.75. Since x1 is randomized, the returned probability is Pr[Φ] = 0.4 · 0.75 + 0.6 · 0.75 = 0.75.
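As a recap of Theorem 1 on the example above, the following sketch groups clauses that share variables into components using union-find (a substitute for the breadth-first traversal of the connectivity graph described earlier); the integer variable encoding is an assumption of this sketch.

    # A minimal sketch of component analysis via union-find: clauses sharing a
    # variable end up in the same group, so, by Theorem 1, each group can be
    # solved independently and the resulting probabilities multiplied.
    def components(clauses):
        parent = {}
        def find(v):
            parent.setdefault(v, v)
            while parent[v] != v:
                parent[v] = parent[parent[v]]   # path halving
                v = parent[v]
            return v
        def union(a, b):
            parent[find(a)] = find(b)
        for clause in clauses:
            first, *rest = [abs(l) for l in clause]
            for v in rest:
                union(first, v)
        groups = {}
        for clause in clauses:
            groups.setdefault(find(abs(clause[0])), []).append(clause)
        return list(groups.values())

    # The cofactor Φ|x1 of Example 2, with x1..x4 encoded as 1..4 and y1..y3 as
    # 5..7, splits into the two components Φ1 and Φ2 used in the example.
    phi_x1 = [[-5, 6, 7], [5, -6, 4], [-6, -7], [6, 7, 4], [7, -4], [2, 3]]
    for comp in components(phi_x1):
        print(comp)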
4 SharpSSAT Witness Generation

In this section, we focus on witness generation for SharpSSAT. The algorithm proceeds in two phases: it first records a trace corresponding to the solving steps and then constructs a Skolem-function witness from the trace.

Trace Representation

A trace is represented as a single-source directed acyclic graph (DAG). Each node n corresponds to a formula n.f. We use Parents(n) and Children(n) to denote the sets of parents and children of n, respectively. A trace has two constant sink nodes, the zero-node and the one-node, with the associated formulas ⊥ and ⊤, respectively. On the other hand, there are two types of non-leaf nodes, the decision-node and the and-node, which correspond to a branching step and a component analysis step, respectively, during SSAT solving. Each decision-node n is associated with a decision variable n.v and has exactly two children, a positive one and a negative one with respect to their parent n. Note that a non-root node can be a negative child of one parent and, at the same time, a positive child of another parent. For a decision-node n with variable n.v, the formulas of its positive child and negative child are n.f|v and n.f|¬v, respectively, where n.f|v and n.f|¬v are obtained from n.f by constant propagation after replacing variable n.v with ⊤ and ⊥, respectively. For each decision SharpSSAT makes, the corresponding decision-node and the existential literals implied by BCP due to the decision are recorded for later witness extraction.

On the other hand, an and-node corresponds to a component analysis step and can have one or more children, each representing one sub-component. Consider the case where an and-node n has exactly one child, meaning that n.f itself is a connected component. Such an and-node is redundant, as its formula is identical to its child's, and it can thus be omitted by connecting its parents to its child. Given an and-node n and its children Children(n), the relationship among their corresponding formulas is n.f = ⋀_{c ∈ Children(n)} c.f. Also, the children of an and-node must be decision-nodes. With the above definitions, a trace characterizes the solving steps of the SharpSSAT procedure for witness extraction.

Example 3. Consider the SSAT formula in Example 2. The resulting trace can be constructed as shown in Fig. 1, where the rectangular nodes are the constant sink nodes, the circular nodes are decision-nodes, and the elliptic nodes are and-nodes. We note that the multiple incoming edges of nodes x2 and y3 are due to caching.

Figure 1: The trace in Example 3.

To support caching, we extend the hybrid coding mentioned in the previous section with an extra pointer to the node representing the cached component. When there is a cache hit, the trace can be updated by adding an edge from the current node to the cached node.

Unfortunately, a naive implementation of the above-mentioned trace becomes impractical in terms of memory usage when a solving procedure involves hundreds of thousands of decisions. In our implementation, we observe that any and-node in the trace can be eliminated by directly connecting its parents to its children and marking these children negative (positive) if the and-node is a negative (positive) child of its parent. The resulting trace consists of decision-nodes only. On the other hand, recall that the descendants of the current node, corresponding to the current solving steps, are removed from the trace when the function removePollutedCache is called. To reuse the memory occupied by those removed nodes, we maintain a reference count for each node and periodically free the nodes with zero reference count.
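As an illustration of the trace representation, the sketch below shows one possible decision-node record for the and-node-free trace described above, with parent/child links, the implied existential literals, and a reference count for reclaiming nodes dropped by removePollutedCache; all class and field names are hypothetical, not SharpSSAT's actual data structures.

    # A hypothetical decision-node record for the and-node-free trace; the edge
    # polarity is stored on the child link, as described in the text.
    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple

    @dataclass
    class TraceNode:
        var: int                                     # decision variable n.v
        max_branch: Optional[bool] = None            # branch with the larger probability (existential nodes)
        children: List[Tuple["TraceNode", bool]] = field(default_factory=list)  # (child, is_positive_edge)
        parents: List["TraceNode"] = field(default_factory=list)
        implied_lits: List[int] = field(default_factory=list)  # existential literals implied by BCP
        ref_count: int = 0                           # nodes with zero count are periodically freed
        tau: Optional[object] = None                 # precondition n.tau, filled in during extraction

        def add_child(self, child: "TraceNode", positive: bool) -> None:
            # Also used on a cache hit: the new edge then points to the cached node.
            self.children.append((child, positive))
            child.parents.append(self)
            child.ref_count += 1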
Witness Extraction from Trace

Upon completion of the solving procedure, we extract a Skolem-function witness from the trace. The witness is constructed by traversing the trace once in topological order. The procedure is shown in Algorithm 2. We assume that the Skolem function fx of each existential variable x is globally accessible. For each node n, a formula n.τ is maintained to characterize the precondition (in terms of randomized variables) under which the decision at n is made. When a node n is visited, its precondition λ is the disjunction over all the preconditions of Parents(n) (Line 7). Suppose the decision variable x is existential. In that case, the algorithm updates the Skolem functions with λ by the function updateSkolemFunction, shown in Algorithm 3, according to the decision branch (n.maxBranch) that leads to the greater satisfying probability and the implied existential literals L. For the case where x is randomized, the Skolem functions are constructed similarly, except that the decision on x must be included in the precondition.

Algorithm 2: extractStrategyFromTrace
Input: a trace
Output: {fx | x ∈ X∃}
1:  for each node n ∈ trace do
2:      n.τ ← ⊥
3:  for x ∈ X∃ do
4:      fx ← ⊥
5:  for each node n ∈ trace in topological order do
6:      x ← n.v
7:      λ ← ⋁_{p ∈ Parents(n)} p.τ
8:      if x ∈ X∃ then
9:          if n.maxBranch = ⊤ then
10:             L ← existential literals implied by x
11:             updateSkolemFunction(L ∪ {x}, λ)
12:         else
13:             L ← existential literals implied by ¬x
14:             updateSkolemFunction(L ∪ {¬x}, λ)
15:         n.τ ← λ
16:     else
17:         for d ∈ Children(n) do
18:             if d is positive then
19:                 λ′ ← λ ∧ x
20:                 L ← existential literals implied by x
21:                 updateSkolemFunction(L, λ′)
22:             else
23:                 λ′ ← λ ∧ ¬x
24:                 L ← existential literals implied by ¬x
25:                 updateSkolemFunction(L, λ′)
26:             d.τ ← λ′
27: return {fx | x ∈ X∃}

Algorithm 3: updateSkolemFunction
Input: a set of literals L, the precondition λ
Output: void
1: for l ∈ L do
2:     x ← var(l)
3:     if l is positive then
4:         fx⁺ ← fx⁺ ∨ λ
5:     else
6:         fx⁻ ← fx⁻ ∨ λ
7: return

Proposition 1. Algorithm 2 generates a set of Skolem functions as a witness to the satisfying probability p returned by SharpSSAT.

The correctness of the proposition can be established by induction on the structure of the trace. We consider Proposition 1 on the associated formula of a node and process the nodes in reverse topological order. The induction starts from the base case, the decision-nodes with two constant children. The induction hypothesis assumes that the proposition holds at the k-th visited node. Now consider the (k+1)-th visited node n. If n.v is existential, the Skolem functions of its descendants remain unchanged, while the Skolem function of n.v is ⊤ or ⊥ depending on the decision that leads to the greater probability. Otherwise, the Skolem functions of the descendants of its positive child and negative child have to be strengthened by conjunction with n.v or ¬n.v, depending on the decision branch. It can be checked that, with this update, the resulting Skolem functions are the same as those generated by Algorithm 2. Finally, by induction, the proposition holds at the last processed node, which is the only source node of the trace.
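To connect Proposition 1 with the witness definition of Section 2, the following brute-force sketch substitutes a set of Skolem functions into the matrix and sums the weights of the satisfying randomized assignments; it is a checker for small formulas only, and the encodings (signed-integer literals, Skolem functions as Python callables) are assumptions of this sketch.

    # A brute-force check of a Skolem-function witness: substitute the functions
    # into the matrix and sum the probabilities of the satisfying randomized
    # assignments. Exponential; for illustration only.
    from itertools import product

    def witness_probability(randomized, clauses, skolem):
        """randomized: {var: Pr[var = True]}; skolem: {var: f(assignment) -> bool}."""
        total = 0.0
        rvars = list(randomized)
        for values in product([False, True], repeat=len(rvars)):
            assignment = dict(zip(rvars, values))
            # Evaluate each Skolem function on the randomized assignment.
            assignment.update({x: f(assignment) for x, f in skolem.items()})
            lit_true = lambda l: assignment[abs(l)] == (l > 0)
            if all(any(lit_true(l) for l in c) for c in clauses):
                w = 1.0
                for v in rvars:
                    w *= randomized[v] if assignment[v] else 1 - randomized[v]
                total += w
        return total

    # Example 1 with the witness f_y1 = TRUE and f_y2 = x1 AND x2
    # (variables encoded as x1 = 1, x2 = 2, y1 = 3, y2 = 4); the result is 0.4.
    matrix = [[-3, 1], [3, -1], [-4, 1], [-4, 2], [4, -1, -2]]
    skolem = {3: lambda a: True, 4: lambda a: a[1] and a[2]}
    print(witness_probability({1: 0.4, 2: 0.3}, matrix, skolem))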
5 ClauSSat Witness Generation

In addition to witnessing the probability returned by SharpSSAT, we develop a witness generation algorithm for ClauSSat. We first give some background on the clause selection mechanism of ClauSSat before elaborating the witness generation procedure.

Clause Selection

The solver ClauSSat (Chen, Huang, and Jiang 2021) extends the clause selection technique for quantified Boolean formulas (QBF) (Janota and Marques-Silva 2015) to SSAT. It introduces local and global selection variables for each clause to track the satisfaction of clauses at each quantification level. We follow the notation used in (Chen, Huang, and Jiang 2021). Given an SSAT formula of Eq. (1), where φ = C1 ∧ C2 ∧ · · · ∧ Cn, the subclause of Ci with respect to some quantification level j, denoted by C_i^{∼j}, is the set of literals {l | l ∈ Ci, qlev(l) ∼ j}, where ∼ ∈ {<, >, ≤, ≥, =}. In the sequel, the "=" in C_i^{=j} is omitted as C_i^j for simplicity. The local and global selection variables are defined as follows.

Definition 5.1 (Local and global selection variable). The local selection variable t_i^j for clause Ci at quantification level j is defined and constrained by t_i^j ↔ ¬C_i^j. The global selection variable s_i^j for clause Ci at quantification level j is defined and constrained by s_i^j ↔ ¬C_i^{≤j}.

A clause Ci is said to be locally selected (resp. locally deselected) at quantification level k if t_i^k evaluates to ⊤ (resp. ⊥). Similarly, a clause is said to be globally selected (resp. globally deselected) at quantification level k if s_i^k evaluates to ⊤ (resp. ⊥).

The global selection variables describe the selection status up to the current quantification level, i.e., which clauses are already satisfied and which are not. The local selection variables further characterize at which quantification levels the clauses are satisfied. The reader is referred to (Chen, Huang, and Jiang 2021) for detailed information on clause selection.

Algorithm 4: solveSSAT-R (Chen, Huang, and Jiang 2021)
Input: Φ = R Xj, . . . , Qn Xn. ϕ
Output: p: satisfying probability
1:  p ← 0
2:  τ ← ∅
3:  CL ← ∅
4:  if j = n then
5:      if SAT(ϕ) = ⊤ then
6:          p ← weightedModelCount(R Xn. ϕ)
7:  else
8:      V ← ∅
9:      ψ_j(Xj, Tj) ← ⋀_{Ci ∈ ϕ} (t_i^j ↔ ¬C_i^j)
10:     ψ_{j+1}(Xj+1, Tj+1) ← ⋀_{Ci ∈ ϕ} (t_i^{j+1} ↔ ¬C_i^{j+1})
11:     while SAT(ψ_j) = ⊤ do
12:         τ_j ← the found model of ψ_j restricted to Xj
13:         (p, τ_{j+1}) ← solveSSAT-∃(Φ|τ_j)
14:         τ′_{j+1} ← maximalPruning(ψ_j|τ_j, ψ_{j+1}, τ_{j+1})
15:         cT_j ← pruneSelection(ψ_j|τ_j, ψ_{j+1}|τ′_{j+1})
16:         V.collectProbabilitySelectionCubesPair(p, cT_j)
17:         CL ← ¬cT_j
18:         ψ_j ← ψ_j ∧ CL
19:         if p = 0 then
20:             addLearntClausesToPriorLevels(CL)
21:         else
22:             constructWitness(CL, τ_{j+1}, j)
23:     p ← computeProbability(V)
24: return p

Algorithm 5: constructWitness
Input: CL, τmax, j
Output: {fx | x ∈ Xj+1}
1: λ ← ¬CL
2: if j > 1 then
3:     λ ← λ ∧ ⋀_{s_i^{j−1} = ⊤} s_i^{j−1} ∧ ⋀_{s_i^{j−1} = ⊥} ¬s_i^{j−1}
4: for l ∈ τmax do
5:     x ← var(l)
6:     if l is negative then
7:         fx⁻ ← fx⁻ ∨ λ
8:     else
9:         fx⁺ ← fx⁺ ∨ λ
10: return {fx | x ∈ Xj+1}

Witness Generation

Witness generation under ClauSSat is shown in Algorithm 5. Algorithm 5 takes three arguments: the learned clause CL, the maximum assignment τmax, and the quantification level j. Considering an induced SSAT formula R Xj, ∃Xj+1, . . . , Qn Xn. ϕ, Algorithm 5 is invoked by solveSSAT-R (Chen, Huang, and Jiang 2021), shown in Algorithm 4 (Line 22), at quantification level j whenever the assignment τmax is returned by the recursive call solveSSAT-∃ (Line 13). Under the clause selection framework, CL consists of disjunctions of local selection literals. Several pruning techniques are dedicated to improving the quality of CL, since the performance of ClauSSat largely depends on the length of CL. Here we assume that CL is the final learned clause strengthened by those techniques.

Algorithm 5 can be interpreted as follows. Given the global selection status S^{j−1} at level j−1, the maximum probability can be achieved by τmax, an assignment over Xj+1, under the assumption of the Boolean space blocked by CL. By adding S^{j−1} ∧ ¬CL into the onsets/offsets of the existential variables according to τmax, Algorithm 5 can correctly construct the Skolem functions for Xj+1.

Proposition 2. ClauSSat with Algorithm 5 generates a set of Skolem functions as a witness of the obtained satisfying probability.

The validity of Proposition 2 is explained as follows. To generate the Skolem functions, Algorithm 5 has to ensure that the Boolean space on which the Skolem functions are already defined is the same as that explored by ClauSSat during solving. For a two-level SSAT formula with a random-exist prefix, ¬CL exactly characterizes the newly explored Boolean space at the randomized quantification level (Line 1). For a general SSAT formula, the Boolean space has to be strengthened by conjunction with the assumptions (the selection status at the previous level, S^{j−1}) that lead to the current induced formula, as shown in Line 3. With the newly explored Boolean space λ, the onset or offset of each Skolem function is updated with λ (Lines 6–9) according to the phase of the corresponding literal in τmax.
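The small sketch below evaluates the local and global selection values of a clause under a given assignment, following Definition 5.1 as rendered above (selected = not yet satisfied); this reading, the clause encoding as (literal, quantification level) pairs, and the function names are assumptions of this sketch rather than ClauSSat's implementation.

    # A sketch of local and global selection values at level j, following the
    # rendering of Definition 5.1 above.
    def subclause(clause, j, rel):
        """C_i^{~j}: literals of the clause whose level satisfies rel(level, j)."""
        return [lit for (lit, lev) in clause if rel(lev, j)]

    def selection_values(clause, j, assignment):
        """Return (t_i^j, s_i^j): local/global selection values at level j."""
        satisfied = lambda lits: any(assignment[abs(l)] == (l > 0) for l in lits)
        t = not satisfied(subclause(clause, j, lambda lev, k: lev == k))
        s = not satisfied(subclause(clause, j, lambda lev, k: lev <= k))
        return t, s

    # Clause (x1 ∨ ¬y1) with qlev(x1) = 1 and qlev(y1) = 2, under x1 = FALSE and
    # y1 = FALSE: the literal ¬y1 satisfies the clause at level 2, so the clause
    # is locally deselected at level 2 and also globally deselected up to level 2.
    print(selection_values([(1, 1), (-2, 2)], 2, {1: False, 2: False}))   # (False, False)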
6 Experimental Evaluation

We implemented the SSAT solver, named SharpSSAT¹, in C++ based on the model counter sharpSAT (Thurley 2006). The witness generation algorithms were integrated into SharpSSAT and ClauSSat. All experiments were conducted on a Linux machine with a 2.2 GHz Intel Xeon CPU and 128 GB of RAM. A time limit of 1000 seconds and a 64 GB memory limit were imposed on solving each instance. We compared our solver with the state-of-the-art SSAT solvers DC-SSAT, ClauSSat, Prime, and ElimSSAT. The benchmark set used in ClauSSat, comprising 20 families with 357 SSAT instances in total, available at https://github.com/NTU-ALComLab/ClauSSat, was taken for evaluation. Among the 20 families (listed in Table 1), 8 are two-level quantified formulas (listed in the lower part of Table 1) and the other 12 are multi-level quantified instances (listed in the upper part of Table 1). The detailed descriptions and statistics of the benchmarks are omitted and can be found in (Chen, Huang, and Jiang 2021). For all the other solvers, we turned on all their optimization options. The default configuration of SharpSSAT includes component analysis, caching, pure-literal detection, and clause learning; the -p and -l options disable pure-literal detection and clause learning, respectively. The cactus plot of SharpSSAT and the other solvers is shown in Fig. 2.

¹ Available at https://github.com/NTU-ALComLab/SharpSSAT

Family | total | SharpSSAT -pl | SharpSSAT -p | SharpSSAT | Prime | DC-SSAT | ClauSSat | ElimSSAT
tlc | 13 | 13 | 13 | 13 | 13 | 13 | 13 | 13
gttt 3x3 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 0
ev-pr-4x4 | 7 | 7 | 7 | 7 | 4 | 0 | 1 | 1
arbiter | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Tree | 14 | 14 | 14 | 14 | 14 | 14 | 14 | 14
Robot | 10 | 3 | 4 | 5 | 4 | 3 | 10 | 5
Planning-CTE | 18 | 0 | 0 | 0 | 0 | 0 | 0 | 2
Counter | 8 | 6 | 8 | 8 | 6 | 4 | 4 | 8
Connect2 | 16 | 16 | 16 | 16 | 9 | 16 | 11 | 2
Adder | 6 | 4 | 5 | 5 | 5 | 4 | 5 | 4
k branch n | 10 | 10 | 10 | 10 | 3 | 2 | 1 | 2
k ph p | 4 | 4 | 4 | 4 | 3 | 3 | 2 | 4
conformant | 24 | 2 | 2 | 3 | 2 | 2 | 1 | 6
Tiger | 5 | 5 | 5 | 5 | 4 | 5 | 2 | 5
Toilet A | 77 | 59 | 63 | 64 | 59 | 42 | 47 | 77
sand-castle | 25 | 23 | 23 | 23 | 20 | 24 | 11 | 14
Max Count | 25 | 10 | 11 | 11 | 9 | 4 | 1 | 10
MPEC | 8 | 4 | 4 | 4 | 4 | 4 | 1 | 8
PEC | 8 | 3 | 4 | 4 | 3 | 0 | 3 | 7
stracomp | 60 | 16 | 16 | 28 | 14 | 28 | 60 | 44
all | 357 | 208 | 218 | 232 | 186 | 177 | 196 | 226

Table 1: Performance comparison in the number of solved instances.

Figure 2: Number of solved instances within the run-time limit.

Evaluation of Solver Performance

In the first experiment, we examine the number of solved instances in each family. The statistics are listed in Table 1; the largest number of solved instances in each row indicates the best-performing solver for that family. By comparing SharpSSAT -p (solving 218 instances in total) with SharpSSAT -pl (solving 208), the effectiveness of clause learning is seen, as it improves the results in several families. By comparing SharpSSAT (solving 232) with SharpSSAT -p, the effectiveness of pure-literal detection is especially significant in the family stracomp, where there are several long clauses containing only one randomized literal and otherwise positive-phase existential literals, and these clauses can be eliminated once the existential literals are detected as pure. By comparing SharpSSAT with the other DPLL-based solvers, Prime and DC-SSAT, it is seen that SharpSSAT solves the most instances in almost all families, except for the sand-castle family. The superiority of SharpSSAT over Prime is expected, as Prime, which only applies component analysis and caching, neglects the implication power strengthened by clause learning. We also noticed that Prime selects the branching literal blindly, i.e., it always chooses the first literal occurring in the component to branch on.
This strategy might be partly to blame for its sub-optimal performance, as the design of the branching heuristic has a great impact on the performance of such DPLL-based solvers (Sang, Beame, and Kautz 2005). On the other hand, although DC-SSAT can deal with general SSAT formulas, it was originally designed for probabilistic planning problems. From the results, we see that SharpSSAT outperforms DC-SSAT in the families that are not planning problems. For the planning families, such as stracomp, sand-castle, and Tiger, SharpSSAT exhibits quite similar strength to DC-SSAT, suggesting that the decomposability of planning problems exploited by DC-SSAT can also be detected through component analysis.

Next, we compare SharpSSAT with the abstraction-based solver ClauSSat. SharpSSAT can solve 75 cases that are not solvable by ClauSSat, and ClauSSat can solve 38 cases that are not solvable by SharpSSAT. When inspecting the number of solved instances in each family, we noticed that SharpSSAT outperforms ClauSSat in most families, while ClauSSat performs especially well in the Robot and stracomp families, both of which consist of two-level formulas with a random-exist prefix. It is observed that when a component consists of only existential literals, called an existential component, the SSAT problem of such a component reduces to a SAT problem. However, the current implementation of SharpSSAT is not aware of such existential components and insists on component analysis instead, which might be too expensive for a SAT problem. We believe that SharpSSAT could solve these random-exist formulas more efficiently if the probability of existential components were determined more effectively, e.g., by conflict-driven clause learning (CDCL); we leave this as future work.

Lastly, we compare SharpSSAT with ElimSSAT. Note that because ElimSSAT requires probability values representable as binary fractions, it rewrites SSAT formulas to approximate probability values with four-bit precision by default. Therefore, the comparison is not on an equal footing but serves as a reference. It can be seen that SharpSSAT performs slightly better than ElimSSAT in the total number of solved instances. When inspecting the results by family, we observed that SharpSSAT outperforms ElimSSAT in most multi-level quantified families, while ElimSSAT solves more instances in the two-level quantified families. The results suggest that SharpSSAT and ElimSSAT exhibit different capabilities in solving multi-level and two-level quantified formulas.

Evaluation of SharpSSAT Branching Heuristics

Since SharpSSAT is a DPLL-based solver, the branching heuristic is critical to its performance. In this experiment, we empirically study the effectiveness of different branching heuristics on SSAT solving. The default branching heuristic of SharpSSAT is the variable state aware decaying sum (VSADS) (Sang, Beame, and Kautz 2005), where the score of each variable is calculated by

    score(VSADS) = p · score(VSIDS) + q · score(DLCS),

where p and q are constants weighting the scores of the variable state independent decaying sum (VSIDS) (Mahajan, Fu, and Malik 2004) and the dynamic largest combined sum (DLCS) (Sang, Beame, and Kautz 2005). The score functions of the different heuristics can be found in (Sang, Beame, and Kautz 2005).
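The following sketch shows how a VSADS-style score could combine a conflict-driven VSIDS activity with the DLCS occurrence count for branching; the data structures, default weights, and function names are illustrative assumptions, not SharpSSAT's implementation.

    # A sketch of VSADS-style branching: combine the VSIDS activity (bumped and
    # decayed on conflicts elsewhere) with the DLCS occurrence count.
    def vsads_score(var, vsids_activity, clauses, p=1.0, q=1.0):
        dlcs = sum(1 for clause in clauses for lit in clause if abs(lit) == var)
        return p * vsids_activity.get(var, 0.0) + q * dlcs

    def choose_branch_var(candidates, vsids_activity, clauses, p=1.0, q=1.0):
        # Random corresponds to p = q = 0 (ties broken arbitrarily),
        # VSIDS to p = 1, q = 0, and VSADS to p = q = 1.
        return max(candidates, key=lambda v: vsads_score(v, vsids_activity, clauses, p, q))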
We tested SharpSSAT with different configurations of the branching strategy, namely Random (p = 0, q = 0), VSIDS (p = 1, q = 0), and VSADS (p = 1, q = 1). The results are reported in Table 2.

Configuration | # solved
Random | 215
VSIDS | 232
VSADS | 232

Table 2: Comparison of branching heuristics in SharpSSAT.

From Table 2, we observe that the branching heuristic can indeed play a significant role in SSAT solving, by comparing Random with VSIDS. On the other hand, we see no significant impact of the DLCS score on SSAT solving.

Evaluation of Witness Generation

In this experiment, we evaluate the overhead of witness generation under SharpSSAT and ClauSSat and the quality of the generated witnesses. For both solvers, the generated witness circuits are written in the BLIF format (Sentovich et al. 1992). Let -k denote the witness generation option. SharpSSAT -k solved all the instances solved by SharpSSAT, while ClauSSat -k solved 10 fewer instances than ClauSSat. For memory usage, SharpSSAT -k typically consumed twice the memory used by SharpSSAT, while ClauSSat -k remained the same as ClauSSat. We further verified each instance using Cachet (Sang et al. 2004) under a time limit of 300 seconds. For SharpSSAT -k (resp. ClauSSat -k), 3 (resp. 21) of the solved instances could not be verified within this time limit. To evaluate the quality of a witness, we synthesized the generated witness into an AIG using ABC (Brayton and Mishchenko 2010). The synthesized witness is further optimized with the ABC command dc2. The witness comparisons in BLIF file size and in synthesized AIG size are shown in Fig. 3 and Fig. 4, respectively.

Figure 3: Witness size comparison in text file size.

Figure 4: Witness size comparison in AIG node count.

In Fig. 3 (resp. Fig. 4), the horizontal and vertical 10^7 (resp. 10^8) lines correspond to timeouts of SharpSSAT and ClauSSat, respectively. The horizontal axis corresponds to the file size (in KB) or the AIG size (in AIG node count) of the witnesses generated by ClauSSat -k, and the vertical axis corresponds to those generated by SharpSSAT -k. As can be seen, in both figures most of the points lie in the bottom-right half-plane, suggesting that SharpSSAT -k tends to produce smaller witnesses in most cases. In summary, SharpSSAT -k incurs very little run-time overhead and tends to generate witnesses of higher quality compared to ClauSSat -k. On the other hand, the memory usage of SharpSSAT -k is higher than that of ClauSSat -k. This is not surprising, since the strategies of ClauSSat -k and SharpSSAT -k are quite different. The witness construction of ClauSSat -k is bottom-up, while SharpSSAT -k follows more of a top-down approach. Therefore, SharpSSAT -k can easily prune irrelevant information, such as the smaller branch of an existential variable, while ClauSSat -k cannot. We observed that the redundancy in the witnesses generated by ClauSSat -k is the reason why the AIG synthesis step closes the gap between SharpSSAT -k and ClauSSat -k when comparing Fig. 3 with Fig. 4, as the synthesis step may exploit the redundancy and produce a smaller witness. This difference also leads to a trade-off between run-time and memory. In order to have global information about the witness, SharpSSAT maintains the trace until the end of solving, which may significantly increase the memory usage. On the other hand, with the trace, SharpSSAT can efficiently construct the witness by traversing the trace once.

7 Conclusions and Future Work

In this work, we have implemented SharpSSAT, a new SSAT solver extending model counting techniques to SSAT solving.
Also, we have equipped SharpSSAT and ClauSSat with witness generation capability. Experimental results have demonstrated the superiority of SharpSSAT over state-of-the-art SSAT solvers. For future work, we plan to further improve SharpSSAT with existential component detection and to explore techniques in knowledge compilation (Darwiche 2004) to benefit SharpSSAT.

Acknowledgements

The authors are grateful to Ricardo Salmon for his technical assistance with using Prime. This work was supported in part by the National Science and Technology Council of Taiwan under Grant 111-2221-E-002-182 and Grant 111-2923-E-002-013-MY3.

References

Brayton, R.; and Mishchenko, A. 2010. ABC: An Academic Industrial-Strength Verification Tool. In Proceedings of International Conference on Computer Aided Verification (CAV), 24–40.
Chen, P.-W.; Huang, Y.-C.; and Jiang, J.-H. R. 2021. A Sharp Leap from Quantified Boolean Formula to Stochastic Boolean Satisfiability Solving. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 3697–3706.
Darwiche, A. 2004. New Advances in Compiling CNF to Decomposable Negation Normal Form. In Proceedings of European Conference on Artificial Intelligence (ECAI), 328–332.
Freudenthal, E.; and Karamcheti, V. 2003. QTM: Trust Management with Quantified Stochastic Attributes. Technical Report TR 2003-848, CS Department, New York University.
Gent, I.; Giunchiglia, E.; Narizzano, M.; Rowley, A.; and Tacchella, A. 2003. Watched Data Structures for QBF Solvers. In Proceedings of International Conference on Theory and Applications of Satisfiability Testing (SAT), 25–36.
Ghosh, B.; Basu, D.; and Meel, K. S. 2021. Justicia: A Stochastic SAT Approach to Formally Verify Fairness. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 7554–7563.
Janota, M.; and Marques-Silva, J. 2015. Solving QBF by Clause Selection. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), 325–331.
Lee, N.-Z.; and Jiang, J.-H. R. 2018. Towards Formal Evaluation and Verification of Probabilistic Design. IEEE Transactions on Computers (TCOMP), 67(8): 1202–1216.
Littman, M. L.; Majercik, S. M.; and Pitassi, T. 2001. Stochastic Boolean Satisfiability. Journal of Automated Reasoning (JAR), 27(3): 251–296.
Mahajan, Y. S.; Fu, Z.; and Malik, S. 2004. Zchaff2004: An Efficient SAT Solver. In Proceedings of International Conference on Theory and Applications of Satisfiability Testing (SAT), 360–375.
Majercik, S. M.; and Boots, B. 2005. DC-SSAT: A Divide-and-Conquer Approach to Solving Stochastic Satisfiability Problems Efficiently. In Proceedings of National Conference on Artificial Intelligence (AAAI), 416–422.
Marques-Silva, J. P.; and Sakallah, K. A. 1999. GRASP: A Search Algorithm for Propositional Satisfiability. IEEE Transactions on Computers (TCOMP), 48(5): 506–521.
Salmon, R.; and Poupart, P. 2019. On the Relationship between Stochastic Satisfiability and Partially Observable Markov Decision Processes. In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI).
Sang, T.; Bacchus, F.; Beame, P.; Kautz, H. A.; and Pitassi, T. 2004. Combining Component Caching and Clause Learning for Effective Model Counting. In Proceedings of International Conference on Theory and Applications of Satisfiability Testing (SAT), 20–28.
Sang, T.; Beame, P.; and Kautz, H. 2005. Heuristics for Fast Exact Model Counting. In Proceedings of International Conference on Theory and Applications of Satisfiability Testing (SAT), 226–240.
Sentovich, E.; Singh, K.; Lavagno, L.; Moon, C.; Murgai, R.; Saldanha, A.; Savoj, H.; Stephan, P.; Brayton, R. K.; and Sangiovanni-Vincentelli, A. L. 1992. SIS: A System for Sequential Circuit Synthesis. Technical Report UCB/ERL M92/41, EECS Department, University of California, Berkeley.
Thurley, M. 2006. sharpSAT – Counting Models with Advanced Component Caching and Implicit BCP. In Proceedings of International Conference on Theory and Applications of Satisfiability Testing (SAT), 424–429.
Wang, H.-R.; Tu, K.-H.; Jiang, J.-H. R.; and Scholl, C. 2022. Quantifier Elimination in Stochastic Boolean Satisfiability. In Proceedings of International Conference on Theory and Applications of Satisfiability Testing (SAT), 1–17.