# maxsat_resolution_with_the_dual_rail_encoding__c84e9c6a.pdf

Max SAT Resolution with the Dual Rail Encoding

Maria Luisa Bonet,1 Sam Buss,2 Alexey Ignatiev,3,4 Joao Marques-Silva,3 Antonio Morgado3

1Computer Science Department, Universidad Polit ecnica de Catalu na, Barcelona, Spain 2Department of Mathematics, University of California, San Diego, USA 3LASIGE, Faculdade de Ciˆencias, Universidade de Lisboa, Portugal 4ISDCT SB RAS, Irkutsk, Russia

Conﬂict-driven clause learning (CDCL) is at the core of the success of modern SAT solvers. In terms of propositional proof complexity, CDCL has been shown as strong as general resolution. Improvements to SAT solvers can be realized either by improving existing algorithms, or by exploiting proof systems stronger than CDCL. Recent work proposed an approach for solving SAT by reduction to Horn Max SAT. The proposed reduction coupled with Max SAT resolution represents a new proof system, DRMax SAT, which was shown to enable polynomial time refutations of pigeonhole formulas, in contrast with either CDCL or general resolution. This paper investigates the DRMax SAT proof system, and shows that DRMax SAT p-simulates general resolution, that AC0Frege+PHP p-simulates DRMax SAT, and that DRMax SAT can not p-simulate AC0-Frege+PHP or the cutting planes proof system.

Introduction The practical success of Conﬂict-Driven Clause Learning (CDCL) SAT solvers hinges on what can be construed as a relatively weak proof system, at least when compared with several others (Beame and Pitassi 2001; Buss 2012; Nordstr om 2015). This proof system (CDCL) is as powerful as general resolution (RES). One approach to improving SAT solvers is to exploit proof systems stronger than CDCL/RES; however, it is open whether a proof system stronger than CDCL/RES can yield more efﬁcient SAT solvers, as attempts at exploiting extended resolution (Huang 2010; Audemard, Katsirelos, and Simon 2010) or cutting planes in SAT solvers (Nordstr om 2015) have been so far largely unsuccessful. Furthermore, a key issue from a practical perspective is whether stronger proof systems are automatizable (Bonet, Pitassi, and Raz 2000), and unfortunately, most results regarding automatizability are negative (Bonet, Pitassi, and Raz 2000). Recent work (Ignatiev, Morgado, and Marques-Silva 2017) proposed a different take on SAT solving. Propositional formulas can be re-encoded, using a variant of

This work was ﬁnancially supported in part by TIN201676573-C2-2-P and by FCT funding of post-doctoral grants SFRH/BPD/103609/2014, SFRH/BPD/120315/2016, and LASIGE Research Unit, ref. UID/CEC/00408/2013. Copyright c 2018, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

the well-known dual rail encoding (Bryant et al. 1987; Palopoli, Pirri, and Pizzuti 1999) and then refuted with a Max SAT solver, e.g. Max SAT resolution (Larrosa and Heras 2005; Bonet, Levy, and Many a 2007) or core-guided Max SAT (Morgado et al. 2013). The re-encoded formulas are polynomial size (in fact linear). Somewhat surprisingly, the propositional encoding of the pigeonhole principle (PHP), if reencoded with the modiﬁed dual rail encoding, can be refuted in polynomial time, both with Max SAT resolution and with core-guided Max SAT (Ignatiev, Morgado, and Marques-Silva 2017)1. In contrast, the ordinary (non-dual-rail encoded) PHP formulas have exponential lower bounds for RES and CDCL (Haken 1985; Beame, Kautz, and Sabharwal 2004), but also for Max SAT resolution (Bonet, Levy, and Many a 2007). This paper begins the process of charting the relative efﬁciency of the dual rail Max SAT (DRMax SAT) proof system. The ﬁrst result is that DRMax SAT p-simulates general resolution, and thus since DRMax SAT has short proofs of the PHP, it is a strictly stronger proof system than either CDCL or RES. The paper also shows that DRMax SAT cannot p-simulate either of AC0-Frege+PHP or cutting planes. Finally, the paper investigates a variant of the pigeonhole principle (Biere 2013a). In practice, this variant is much harder than plain PHP formula. Experimental results on formulas encoding this variant of PHP conﬁrm that the practical implementation of DRMax SAT outperforms modern CDCL SAT solvers.

Preliminaries

Max SAT and Weighted Max SAT. Max SAT is the problem of ﬁnding an assignment that minimizes the number of falsiﬁed clauses of a CNF formula. Max SAT has several generalizations. To deﬁne them, we need to give weights to clauses, with the weight indicating the cost of falsifying the clause. A weighted clause is written (A, w) where A is a clause and w {1, 2, 3, . . .} { }. The value is viewed

1Earlier work (Sabharwal 2005) showed that problem-speciﬁc symmetry-breaking can serve to reduce refutations of PHP from exponential to polynomial. In contrast, the results in (Ignatiev, Morgado, and Marques-Silva 2017) exploit a general reduction to Horn Max SAT.

The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)

as equaling inﬁnity, but we write instead of . A typical use of weighted clauses is for Partial Max SAT, where the clauses of Γ are partitioned into soft clauses and hard clauses. Soft clauses may be falsiﬁed and have weight 1; hard clauses may not be falsiﬁed and have weight . So Partial Max SAT is the problem of ﬁnding an assignment that satisﬁes all the hard clauses and minimizes the number of falsiﬁed soft clauses. In Weighted Partial Max SAT, the soft clauses may have any (ﬁnite) weight 1. Weighted Partial Max SAT is the problem of ﬁnding an assignment that satisﬁes all the hard clauses and minimizes the sum of the weights of falsiﬁed soft clauses. The Max SAT resolution calculus is a sound and complete calculus for Max SAT based on resolution. This system was ﬁrst deﬁned by (Larrosa and Heras 2005), and proven complete by (Bonet, Levy, and Many a 2007). A similar calculus can also be deﬁned for Partial Max SAT and Weighted Partial Max SAT. Like classical resolution, (Weighted) (Partial) Max SAT resolution is based on a unique inference rule. In classical resolution, every application of the resolution rule adds a new clause to the system. The inference rule for (Weighted) (Partial) Max SAT, however, replaces two clauses by a different set of clauses. In other words, a clause may be used only once as a hypothesis of a (Weighted) (Partial) Max SAT resolution inference. The inference rule for clauses with ﬁnite weights is:

(x A, w1) (x B, w2) (A B, min(w1, w2)) (x A, w1 min(w1, w2)) (x B, w2 min(w1, w2)) (x A B, min(w1, w2)) (x A B, min(w1, w2))

The notation x A B, where A = a1 as and B = b1 bt with t > 0, is the abbreviation of the set of clauses

x a1 as b1 x a1 as b1 b2 x a1 as b1 bt 1 bt

When t = 0, B is the constant true, so x A B denotes the empty set of clauses. x A B is deﬁned similarly. In the rule, conclusion clauses with weight 0 are omitted; e.g., at least one of the second or third conclusions is omitted; both are omitted if w1 = w2. If one or both weights are , the following rules apply

(x A, w) (x B, ) (A B, w) (x A B, w) (x B, )

(x A, ) (x B, ) (A B, ) (x A, ) (x B, )

for ﬁnite w. The second rule is just the ordinary resolution inference, as the premises are still available as conclusions. After applying the rule, we remove tautologies, and collapse repeated occurrences of variables in clauses. As noted,

for Max SAT inferences the premises are replaced with the conclusions. Note that these inferences depend on the orderings of the literals a1, . . . , as and the literals b1, . . . , bt. This means that, in general, there are multiple ways to apply the rule to a given pair of clauses. It is easy to check that if a truth assignment τ falsiﬁes the formula x A B, then it falsiﬁes exactly one of the clauses in (2), and similarly for x A B. Also, if τ makes one of the premises of (1) with weight w false, then the sum of the weights of the falsiﬁed conclusions is w. Likewise, if τ satisﬁes both premises of (1), then it satisﬁes all the conclusions. Similar considerations apply to inferences on clauses with weight . The soundness of the Weighted Max SAT rule follows immediately. A (Weighted) (Partial) Max SAT refutation starts with a multiset Γ of clauses. After each inference, the multiset of clauses is updated by removing the rule s premises and adding its conclusions. The Max SAT refutation ends with a multiset containing k > 0 occurrences of the empty clause , possibly with weights. The rules give a sound and complete system for Weighted Partial Max SAT (Bonet, Levy, and Many a 2007). Given a set Γ of weighted clauses and a truth assignment τ, the cost of τ is the sum of weights of the clauses that τ falsiﬁes; the cost is inﬁnite if some hard clause is falsiﬁed. Soundness means that if there is a derivation from Γ of empty clauses with weights summing up to k, then there is no assignment of cost < k. Completeness means that if k is the minimum cost of an assignment for Γ, then there is a derivation from Γ of empty clauses with weights adding up to k. It is useful to also have the following two rules when dealing with soft clauses with weights bigger than 1.

(A, w1+w2) Extraction: (A, w1) (A, w2)

(A, w1) (A, w2) Contraction: (A, w1+w2) The Max SAT system is unusual in that its rules have multiple conclusions. This can have unexpected consequences. For example, one might expect that since soft clauses cannot be reused, this means that the portion of a Max SAT refutation that uses soft clauses is tree-like. This is not true however, because an inference may have multiple soft clauses among its conclusions, which can be used at different points in the refutation.

Dual Rail Max SAT. We now deﬁne the dual rail Max SAT system (Ignatiev, Morgado, and Marques-Silva 2017) for refuting a set of clauses Γ. The dual rail Max SAT system is based on Max SAT resolution, but as already mentioned is strictly stronger than resolution. Let Γ be a set of clauses (viewed as hard clauses) over the variables {x1, . . . , xs}. The dual rail encoding Γdr of Γ, uses 2s variables n1, . . . , ns and p1, . . . , ps in place of the s variables xi. The intent is that pi is true if xi is true, and that ni is true if xi is false. The dual rail encoding Cdr of a clause C is deﬁned by replacing each (unnegated) variable xi in C with ni, and replacing each (negated) literal xi

in C with pi. For example, if C is {x1, x3, x4}, then Cdr is {n1, p3, n4}. Note that every literal in Cdr is negated. The dual rail encoding Γdr of Γ contains the following clauses: (1) the hard clause Cdr for each C Γ; (2) the hard clauses pi ni for 1 i s; and (3) the soft clauses pi and ni for 1 i s. Note that all clauses of Γ are Horn: the hard clauses contain only negated literals and the soft clauses are unit clauses. A dual rail Max SAT refutation of Γ is deﬁned as a Max SAT derivation of a multiset of clauses containing s+1 many copies of the empty clause from Γdr. This is based on the fact that Γ is satisﬁable if and only if there is a truth assignment τ which makes all the hard clauses of Γdr true, and only s of the soft clauses false (Ignatiev, Morgado, and Marques-Silva 2017). Γdr is equivalently represented as a set of weighted clauses: (Cdr, ) for C Γ (pi ni, ) for 1 i s (pi, 1) for 1 i s (ni, 1) for 1 i s. More generally, given a set of ﬁnite positive weights w1, . . . , ws, the weighted dual rail encoding Γwdr of Γ is deﬁned as the set of clauses (Cdr, ) for C Γ (pi ni, ) for 1 i s (pi, wi) for 1 i s (ni, wi) for 1 i s. Letting k =

i wi, a weighted dual rail Max SAT refutation is a Max SAT derivation of a set of empty clauses with total weight k+1, from Γwdr. Note that each choice of weights w1, . . . , ws gives a different weighted dual rail encoding. What is important for the refutations is that the weights are chosen sufﬁciently large: any extra weight can be handled by using the fact that an empty clause can be derived from the hard clause pi ni and the two soft clauses pi and ni. When the weights wi are all small (i.e., polynomially bounded), then it is convenient to work with the multiple dual rail Max SAT system. In this system, instead of including the clauses (pi, wi) and (ni, wi) with weights wi possibly larger than 1, we introduce wi many copies of the soft clauses pi and ni, each of weight 1. The resulting set of clauses is denoted by Γmdr. Any Max SAT derivation from Γmdr is readily converted into a Max SAT derivation from Γwdr. Conversely, if there is polynomial upper bound on the values wi, then the size of a Max SAT derivation from Γwdr can be converted into a Max SAT derivation from Γmdr with size only polynomially bigger. This means that the weighted dual rail Max SAT system is a strengthening of the multiple dual rail Max SAT system. For the present paper, the main advantage of working with the multiple dual rail Max SAT system instead of with the weighted dual rail Max SAT system is that it simpliﬁes notation for the proof of Theorem 1 by letting us discuss soft and hard clauses without explicitly writing their weights.

An example. We present a very simple example of a DRMax SAT refutation which refutes the three clauses x1 x2,

x1 and x2. This is almost the simplest possible example, but still reveals interesting aspects. The dual rail encoding has the ﬁve hard clauses

p1 n2 n1 p2 p1 n1 p2 n2,

plus the four soft unit clauses

p1 n1 p2 n2.

Since there are two variables, a DRMax SAT refutation must derive a multiset containing three copies of the empty clause . The following four inferences will be used to form the refutation (the weights 1 and are used for soft and hard clauses, respectively):

(n1, 1) (n1, )

( , 1) (n1, )

(p2, 1) (p2, )

( , 1) (p2, )

(p1, 1) (p1 n2, )

(n2, 1) (p1 n2, 1) (p1 n2, )

(n2, 1) (n2, 1)

We describe a DRMax SAT refutation using these four inferences; its lines consist of ﬁve multisets of clauses Γ0, Γ1, Γ2, Γ3, Γ4. The initial multiset Γ0 contains the nine clauses given above. Since the set of hard clauses never changes, each Γi has the form Γi = Si H where H is the set of ﬁve hard clauses above, and Si is a multiset of soft (weight 1) clauses. Namely,

S0 = {p1, n1, p2, n2} S1 = {p1, , p2, n2} S2 = {p1, , , n2} S3 = {n2, p1 n2, , , n2} S3 = { , p1 n2, , }.

Here S0 is the four initial soft clauses; and S3 contains three copies of as needed for a valid DRMax SAT refutation. There is a couple interesting observations about even such a simple derivation. First, it splits neatly into three independent parts: one that uses n1 and n1 to derive , one that uses p2 and p2 to derive , and one that uses the other clauses to derive a third copy of . This splitting is part of the reason that DRMax SAT can give simpler proofs than ordinary resolution, say for PHP. Second, there is an extra soft clause p1 n2 that is derived but not used; this is a common feature of DRMax SAT refutations.

The Parity principle and Dual Rail encoding of Doubled Pigeonhole principles. The present paper uses (unweighted) dual rail encodings of two combinatorial principles. The ﬁrst is the Parity Principle, expressing a kind of mod 2 counting, which states that no graph on 2m + 1 nodes consists of a complete perfect matching (Ajtai 1990; Beame et al. 1996; Beame and Pitassi 1996). The propositional version of the Parity Principle, for m 1, uses

2m+1 2 variables xi,j, where i = j and xi,j is identiﬁed with xj,i. The intuitive meaning of xi,j is that there is an edge between vertex i and vertex j. The Parity Principle, Parity2m+1, has the following sets of clauses: j =i xi,j for i [2m+1]

xi,j xk,j for i, j, k distinct members of [2m+1]. These clauses state that each vertex has degree one. The second combinatorial principle is the Doubled Pigeonhole Principle, also called the Two Pigeons Per Hole Principle , which states that if 2m+1 pigeons are mapped to m holes then some hole contains at least three pigeons (Biere 2013b). This is encoded with the following clauses 2PHP2m+1 m : m j=1 xi,j for i [2m+1]

xi,j xk,j xℓ,j for distinct i, k, ℓ [2m+1].

The dual rail encoding, (2PHP2m+1)dr, of 2PHP2m+1 m contains the hard clauses m j=1 ni,j for i [2m+1]

pi,j pk,j pℓ,j for j [m] and distinct i, k, ℓ [2m+1]. The soft clauses are the unit clauses ni,j and pi,j for all i [2m+1] and j [m]. There are (2m+1)m positive variables pi,j and likewise (2m+1)m negative variables ni,j, for a total of 2(2m+1)m many soft clauses. A dual rail Max SAT refutation for 2PHP2m+1 m must produce (2m+1)m + 1 many empty clauses ( s) from (2PHP2m+1)dr.

AC0-Frege and Cutting Planes proof systems. To be able to compare dual rail Max SAT with resolution, AC0Frege and Cutting Planes, we need the following terminology. Proof length is measured in terms of the total number of symbols appearing in the proof. A proof system P is said to simulate another proof system Q provided that there is a polynomial p(n) so that any Q proof of size N can be transformed (by a polynomial time construction) into a P-proof of size p(N) of the same formula. For more information on proof complexity, see e.g. the surveys (Buss 2012; Pudl ak 1999). A Frege system is a textbook-style proof system, usually deﬁned to have modus ponens as its only rule of inference (Cook and Reckhow 1979). For convenience in deﬁning the depth of formulas, we can treat an implication A B as being an abbreviate for A B. The depth of propositional formula is measured in terms of alternations: assume a formula ϕ uses only the connectives , and . Using de Morgan s rules, there is a canonical transformation of ϕ into a formula ϕ in negation normal form , i.e., with negations applied only to variables. Viewing ϕ as a tree, the depth of ϕ is the maximum number of blocks of adjacent s and adjacent s along any branch in the tree ϕ . A depth d Frege proof is a Frege proof in which every formula has depth d. An AC0-Frege proof is a proof with a constant upper bound on the depth of formulas appearing in the proof.

The cutting planes system is a pseudo-Boolean propositional proof system. It uses variables xi which take on 0/1 values, indicating Boolean values False and True. The lines of a cutting planes proof are inequalities of the form a1x1 + a2x2 + + anxn an+1, where the ai s are integers. Logical axioms include xi 0 and xi 1; inference rules include addition, multiplication by a integer, and a special division rule. A cutting planes proof refuting a set Γ of clauses has axioms expressing the truth of the clauses in Γ, and has 0 1 as its last line. The cutting planes system CP uses integers ai written in binary; the system CP uses the integers ai written in unary notation. The size of a CP or CP

proof is the total number of symbols in the proof, including the bits used for representing the values of the coefﬁcients ai. For more on cutting planes, see e.g. (Pudl ak 1997; Buss and Clote 1996).

Simulations of Resolution Tree-like resolution Theorem 1. Multiple dual rail Max SAT simulates tree-like resolution. We start with a useful observation. The dual rail encodings include soft unit clauses pi and ni and hard clauses pi ni. Applying a Max SAT inference to pi and pi ni yields the two soft clauses ni and pi ni. Combining ni and ni with a Max SAT inference yields the clause . Thus, we have used up the soft clauses pi and ni and obtained one instance of plus the clause pi ni. As shown next, the soft clause pi ni will let Max SAT simulate a resolution step. Proof. (Sketch) Let R be a tree-like refutation of Γ over the variables x1, . . . , xn. Let ki be the number of times that xi is resolved on in R. We form Γmdr by adding the soft clauses pi and ni with multiplicity ki, and the hard clauses pi ni. (This is permitted as the values ki correspond to the weights wi of a weighted DRMax SAT refutation.) Set K = i ki. By the above observation, from these clauses there is a Max SAT derivation of K many instances of , plus the clauses pi ni with multiplicity ki. We modify the derivation R. For each clause A in Γ, let Adr be the result of replacing members xi with ni, and members xi with pi. An inference in Γ resolving xi A and xi B to obtain A B becomes

ni Adr pi Bdr

Adr Bdr To make this a valid Max SAT inference, ﬁrst resolve ni Adr against an available soft clause pi ni to obtain the soft clause pi A plus some additional clauses. A further Max SAT inference resolves this against pi Bdr to obtain Adr Bdr plus some additional clauses. Continuing this process yields a valid Max SAT refutation of dr, i.e. of . This gives a total of K + 1 clauses as desired.

Note the proof works as long as ki is greater than or equal to the number of times xi is resolved on. For applications, this means it is only needed to have an upper bound on the number of resolutions on xi; for instance, taking ki equal to the total number of inferences in R certainly works.

General resolution Theorem 2. Weighted dual rail Max SAT simulates general resolution. Proof. (Sketch) Let R be a resolution refutation of Γ containing clauses C1, . . . , Cm. Each Ci is either an initial clause from Γ or is derived from two clauses Cj1 and Cj2, where j1 < j2 < i. We deﬁne a directed graph G = ([m], E) encoding the dependencies in the derivation. The set of vertices of G is {1, . . . , m} corresponding to the m clauses of R. The edges are based on inference rules; E is the set of directed edges (j, i) such that Cj is a hypothesis of the resolution inference introducing Ci. Thus, the vertex m (corresponding to Cm) is a sink of G. The sources in G correspond to initial clauses in Γ. All other vertices in G have in-degree two. Since R is not assumed to be tree-like, the out-degrees can be greater than one. We must assign to each clause Ci R a weight wi N. These weights give the weights ki needed for the soft clauses ni and pi when we construct a weighted dual rail Max SAT refutation of Γ. The last (m-th) clause is the ﬁnal derived for the Max SAT refutation: this clause has weight one, wm = 1. For all j < m, deﬁne

(j,i) E wi.

This is the same as deﬁning wj to be the sum of the weights of the clauses which are inferred directly from Cj. Recall the Fibonacci numbers F1 = F2 = 1 and Fi = Fi 1 + Fi 2 for i > 2. The next lemma depends only on the fact that G = ([m], E) has indegree 0 or 2 at every node, and that the directed edges respect the usual ordering of [m]. Lemma 3. wi Fm+1 i. Thus wi < φm/

5 where φ is the golden ratio. (The proof of the lemma is simple and is omitted here for space reasons.) To ﬁnish the proof of Theorem 2, we also need to ﬁx weights ki for the variables xi. Set ki to be equal (or be greater than) the sum of the weights wj of clauses Cj which are introduced by a resolution on xi. By Lemma 3, ki m 2 i=1 φi < φm 1, so ki = 2m is always sufﬁcient. Now Theorem 2 can be proved with the essentially the same construction as Theorem 1. A clause Cℓin R becomes the weighted clause (Cdr ℓ, wℓ) in Rwdr. If Cℓis equal to A B and is derived from xi A and xi B, then in Rwdr, it becomes the (not-yet-valid) inference

(ni Adr, wℓ) (pi Bdr, wℓ)

(Adr Bdr, wℓ) (3)

Note the weights of all three clauses are equal to wℓ. As described below, this is arranged for the two hypotheses by earlier extraction inferences. In Rwdr, the inference (3) is replaced by two Max SAT resolution inferences which resolve against the weighted soft clauses (ni, wℓ) and (pi, wℓ) and the hard clauses (ni pi, ). Rwdr needs inferences to ﬁx up the weights. For i n, let Cℓ1, . . . , Cℓs be the clauses which are inferred by resolving on xi, so ki

j wℓj. At the start of Rwdr, from the initial soft clauses (ni, ki) and (pi, ki), extraction rules are used to derive all the clauses (ni, wℓj) and (pi, wℓj). Similarly,

let Cℓ1, . . . , Cℓs now denote clauses which are derived by resolution using Cℓ, so wℓ= j wℓj. Extraction inferences are used to derive all of the clauses (Cℓ, wℓj) from (Cℓ, wℓ). These clauses are used as hypotheses of later inferences similarly as was done for (3).

AC0-Frege+PHP simulates dual rail Max SAT This section proves that constant depth Frege augmented with the schematic pigeonhole principle PHPn+1 n can polynomially simulate the dual rail Max SAT proof system. Theorem 4. AC0-Frege+PHP p-simulates the dual rail Max SAT system. More precisely, there is a constant d0 and a polynomial p(s) so that the following holds. If Γ is a set of clauses and Γdr has a Max SAT refutation of size s, then Γ has a depth d0 Frege refutation from instances of the PHPn+1 n of size p(s). The value of d0 depends on the exact deﬁnitions of the Frege system (e.g., with modus ponens, or with the sequent calculus, etc.) and of depth; however, d0 is small, approximately equal to 3. In particular, the Frege proof uses instances of PHP which are obtained by substituting depth one formulas (either conjunctions or disjunctions of literals) for the variables zi,j of a pigeonhole formula. It is open whether the theorem holds for the dual rail Max SAT system generalized to allow arbitrary (binaryencoded) weights. Corollary 5. Max SAT refutations of the dual rail encoded Parity Principle require exponential size 2nϵ for some ϵ > 0. Corollary 5 follows from Theorem 4 since (Beame and Pitassi 1996) and (Riis 1993), building on (Ajtai 1990), showed that AC0-Frege+PHP refutations of Parityn require size 2nϵ for some ϵ > 0. Corollary 6. The dual rail Max SAT proof system does not polynomially simulate CP or even CP . Corollary 6 follows from Corollary 5 since it is easy to give polynomial size CP proofs of the parity principle.

Proof. We now prove Theorem 4. Let Γ be an unsatisﬁable set of clauses in the variables x1, . . . , x N. Its dual rail encoding Γdr uses the variables ni and pi for i [N]. By hypothesis, there is a Max SAT derivation D of N + 1 many empty clauses from Γdr. Our goal is to give a AC0-Frege+PHP refutation of Γ; this refutation involves only the variables xi. The intuition for forming the AC0-Frege proof is that we assume that Γ is satisﬁed by x1, . . . , x N, and use the refutation D to deﬁne a contradiction to the pigeonhole principle. The Max SAT refutation D has size s and contains m < s inferences. The j-th inference of D has the form

l A l B A B l A B l A B (4)

for l a literal. Here, l A B and l A B denote a sets of zero or more clauses, which depend on orderings of the literals in A and in B. Let Dj be the multiset of clauses which are available for use in D after the j-th inference. Thus, D0 is the same as Γdr. The multiset Dj+1 is obtained from Dj by removing

the hypotheses of the j-th inference (4) and adding its conclusions. Since D is a valid Max SAT refutation, the ﬁnal set Dm contains N + 1 many empty clauses . Two extra sets D 1 and Dm+1 are deﬁned by letting D 1 contain the N clauses x1, . . . , x N and letting Dm+1 be the multiset containing N + 1 copies of the empty clause . Let D denote the disjoint union of the multisets Dj for 1 j m+1. Members of the multiset D are denoted (C, j) indicating that C is a member of Dj. If there are multiple occurrences of C in Dj, then there are multiple occurrences of (C, j) in D . We will assume that multiple occurrences are correctly tracked with each C labelled as to which occurrence it is, but suppress this in the notation. Let S be the cardinality of D , so S = s O(1). Deﬁne

T = 0 i m+1 Di and U = 1 i m Di.

We have |T| = S N and |U| = S N 1, so |T| = |U|+1. We wish to deﬁne a total and injective function f : T U, based on the assumption that x1, . . . , x N specify a satisfying assignment for Γ: this will contradict the pigeonhole principle. For this, it is necessary to deﬁne formulas Pα,β for each α = (C, j) T and each β = (C , j ) U which deﬁne the condition that f(α) = β. These formulas Pα,β will involve the variables x1, . . . , x N. If (C, j) U, then C is a clause (possibly empty) involving only the variables ni and pi. We wish to identify pi and ni with xi and xi to evaluate the truth of C. Accordingly, deﬁne X(C) to the the formula obtained by replacing the literals pi and ni with xi, and the literals pi and ni with xi. If C contains both pi and ni (or both pi and n1) for some i, then X(C) becomes a tautologous clause and can be treated as the constant . Each C is a ﬁxed clause in D, thus each X(C) is also a ﬁxed clause. We next give the deﬁnition of the function f and deﬁne the formulas Pα,β. Let α be (C, j) and β be (C , j ). The intuition is that if C is true, then f(α) = α; and if C is false then f(α) = β exactly when j = j 1 and C is the false formula in Dj which corresponds to C under the application of the j-th inference of D. More formally: 1. Suppose j = m + 1, so C is an empty clause in the extra set Dm+1. We arbitrarily order the members of Dm+1 and Dm. Suppose C is the ℓ-th member of Dm+1. We wish to assign f(α) to equal the ℓ-th in Dm. Accordingly, Pα,β is the constant (true) if and only if j = j 1 = m and C is the ℓ-th in Dm. Otherwise, Pα,β is the constant (false). 2. Suppose j 1, and that C, as a member of Dj, is not a clause in the conclusion of the j-th inference (4). The idea is that if C is true, then f(α) = α, and if C is false, then f(α) = β provided j = j 1 and C is the same formula as C, namely the occurrence of the clause in Dj 1 which corresponds to C. More formally, Pα,α is the formula X(C). And, if j = j 1 and C Dj 1 is the corresponding occurrence of the clause C in Dj 1, then Pα,β is the formula X(C). In all other cases, Pα,β is . 3. Suppose j 1, and C is one of the conclusions of the j-th inference (4). The idea is that if C is true, then f(α) = α, and if C is false, then f(α) = β provided

j = j 1 and C is the false hypothesis of (4). More formally, Pα,α is the formula X(C). And, if j = j 1 and C Dj 1 is one of the hypotheses of (4), then Pα,β is the formula X(C) X(C ), which is a conjunction of literals. (This can make Pα,β false by virtue of containing both ℓand ℓ.) In all other cases, Pα,β is . 4. Suppose j = 0 and C is a hard clause of Γdr in D0. Assuming Γ is satisﬁed by x1, . . . , x N, C is true; the idea is that f(α) = α. Accordingly, Pα,α is the clause X(C). For all other β, Pα,β is . 5. Finally suppose j = 0 and C is a soft unit clause in Γdr, i.e. either pi or ni. The intuition is again that f(α) = α if C is true. Otherwise f(α) = (xi, 1). Formally, Pα,α is X(C). And, for β = (xi, 1), Pα,β is X(C). For all other β, Pα,β is . The formulas Pα,β are linear size and depth one, either conjunctions or disjunctions of literals. We must argue there are constant depth Frege proofs of the injectivity conditions

Pα,β Pα ,β for all α = α T and all β

and of the totality conditions β U Pα,β for all α T.

The injectivity conditions are easy to check since so many Pα,β s are the constant . First, suppose that α = (C, j) and α = (C , j) where C and C are two of the conclusions of the j-th inference (4). By inspection, C and C contain a clashing literal; thus they cannot both be false. It follows that at least one of Pα,β or Pα ,β is false. A similar, even simpler, argument works when α = (pi, 0) and α = (ni, 0). The injectivity conditions for all other α, α , β are trivial. There are only a couple non-trivial cases to check for the provability of the totality conditions. The ﬁrst case is when α = (C, j) is the conclusion of the j-th inference (4). For this, we must argue that if X(C) is false, then (4) has a hypothesis C that has X(C ) false. This is completely trivial to prove with a constant depth Frege proof, since either (a) one of the hypotheses is a subclause C of C so X(C ) is a subclause of X(C) and thus X(C ) is false, or (b) C is A B in (4) and since X(ℓ) is either false or true and C can be taken to be the ﬁrst or second hypothesis (respectively). The second non-trivial case to check for totality is the case where α = (C, 0) with C one of the hard clauses in Γdr. In this case, Pα,α holds only if X(C) is true. However, X(C) is a member of Γ, and hence X(C) holds under the assumption that x1, . . . , x N satisfy the clauses of Γ. The above obtained a contradiction to the pigeonhole principle from the assumption that the clauses of Γ are true. The argument can be formalized in constant depth Frege; hence AC0-Frege+PHP refutes Γ. By construction, the AC0-Frege+PHP refutation is polynomial size in s.

Upper bound for the doubled PHP This section discusses the doubled pigeonhole principle which states that if 2m + 1 pigeons are mapped to m holes then some hole contains at least three pigeons (Biere 2013b).

Theorem 7. There are polynomial size Max SAT refutations of the dual rail encoding of the 2PHP2m+1 m clauses. Proof sketch. The Max SAT refutation ﬁrst derives 2m+1 clauses , one for each pigeon i [2m+1], by resolving the hard clause m j=1 ni,j against the soft unit clauses ni,j to obtain the clause . These inferences derive other clauses as well, but they are not needed for the refutation, so we just ignore them. The remainder of the Max SAT refutation is more complex and derives 2m 1 empty clauses for each hole j [m]. This gives a total of (2m 1)m additional s and, since 2m+1+(2m 1)m is equal to (2m+1)m+1, sufﬁces to complete the Max SAT refutation. Fix a hole j. We inductively construct Max SAT derivations of 2m 1 empty clauses from the clauses involving literals pi,j. The construction is to be repeated (independently) for each j [m]. The general idea is to derive I 2 many s from the ﬁrst I pigeons, namely using only the literals pi,j for i I. The construction proceeds in stages, one for each value I = 3, 4, . . . , 2m+1. The proof is involved, and we omit it from this version for lack of space.

Experiments. Two sets of doubled pigeonhole formulas were considered encoding At Most2 constraints by (1) triplewise encoding as studied earlier in the paper and (2) sequential counters (Sinz 2005), i.e. with the use of auxiliary variables (Tseitin 1968). The latter set contains PHP2m+1 m formulas for m {5, . . . , 100} while the largest triplewiseencoded formula is constructed for m = 25 (due to the formula s growth as m3 if triplewise-encoded). Our evaluation targets SAT and Max SAT solvers. We tested two CDCL SAT solvers: Glucose 3 (Audemard, Lagniez, and Simon 2013) and lingeling2 (Biere 2013a; 2014). Also, the Max SAT solvers used are: Max HS (Davies and Bacchus 2011; 2013a; 2013b), LMHS (Saikko, Berg, and J arvisalo 2016), Eva500a (Narodytska and Bacchus 2014), Open WBO16 (Martins, Manquinho, and Lynce 2014), and MSCG (Morgado, Ignatiev, and Marques-Silva 2015). Figure 1 depicts the performance of the considered competitors. As expected, SAT solvers can only deal with PHP2m+1 m for m 7 given 1800s timeout, while Max SAT solvers do not perform much better being able to deal with m 15. This can be attributed to the clauses of the dual rail encoding, more precisely to clauses (pi ni, ), (pi, 1) and (ni, 1) introduced for every variable xi of the original CNF formula. Obviously, these clauses comprise unsatisﬁable cores of the dual rail Max SAT formula and these cores are known to potentially confuse a Max SAT solver and, thus, be harmful, as observed in earlier work (Ignatiev, Morgado, and Marques-Silva 2017). Indeed, our results conﬁrm this conjecture as the performance of all Max SAT solvers gets tremendously increased when clauses (pi ni, ) are discarded3 (the corresponding conﬁgurations of Max SAT solvers in Figure 1 are marked with additional symbol ).

2Cardinality reasoning (Biere 2013a), which helps lingeling deal with PHP formulas, was disabled. 3Note that the short proof for 2PHP2m+1 m provided in the previous section does not use clauses (pi ni, ).

0 20 40 60 80 100 instances

CPU time (s)

LMHS Glucose Open WBO16 lingeling MSCG Eva500a Max HS

Figure 1: Performance of SAT and Max SAT solvers on doubled pigeonhole formulas.

In particular, Max HS, LMHS, as well as MSCG can solve all the considered instances (for m up to 100) with the harmful clauses being discarded while Eva500a and Open WBO16 are a few instances behind. This suggests that removing these clauses enables Max SAT solvers to produce a short proof when dealing with 2PHP2m+1 m .

Conclusions This paper investigates the relative efﬁciency of the DRMax SAT proof system (Ignatiev, Morgado, and Marques Silva 2017). The paper shows that DRMax SAT p-simulates general resolution. Given as earlier result of polynomial time refutations of PHP formulas (Ignatiev, Morgado, and Marques-Silva 2017), we conclude that DRMax SAT is a stronger proof system than either general resolution or conﬂict-driven clause learning. The paper also compares DRMax SAT with AC0-Frege+PHP, and proves that AC0Frege+PHP p-simulates DRMax SAT. Moreover, the paper also proves that DRMax SAT does not p-simulate AC0Frege+PHP or the cutting plane based proof systems CP and CP . Finally, the paper investigates the formulas encoding the doubled PHP principle, and derives polynomial size refutations with the DRMax SAT proof system. The results in this paper motivate a number of research lines. The ﬁrst is to understand whether CP p-simulates DRMax SAT. Another research line is to investigate whether the weighted version is stronger than plain DRMax SAT. One line of research is to extend the results in this paper to the case of core-guided algorithms (Ignatiev, Morgado, and Marques-Silva 2017).

References Ajtai, M. 1990. Parity and the pigeonhole principle. In Feasible Mathematics, 1 24. Birkh auser. Audemard, G.; Katsirelos, G.; and Simon, L. 2010. A

restriction of extended resolution for clause learning SAT solvers. In AAAI. Audemard, G.; Lagniez, J.; and Simon, L. 2013. Improving Glucose for incremental SAT solving with assumptions: Application to MUS extraction. In SAT, 309 317. Beame, P., and Pitassi, T. 1996. An exponential separation between the parity principle and the pigeonhole principle. Annals of Pure and Applied Logic 80(3):195 228. Beame, P., and Pitassi, T. 2001. Propositional proof complexity: Past, present, and future. In Current Trends in Theoretical Computer Science. 42 70. Beame, P.; Impagliazzo, R.; Kraj ıˇcek, J.; Pitassi, T.; and Pudl ak, P. 1996. Lower bounds on Hilbert s Nullstellensatz and propositional proofs. Proceedings of the London Mathematical Society 73(3):1 26. Beame, P.; Kautz, H. A.; and Sabharwal, A. 2004. Towards understanding and harnessing the potential of clause learning. J. Artif. Intell. Res. 22:319 351. Biere, A. 2013a. Lingeling, plingeling and treengeling entering the SAT competition 2013. In Proceedings of SAT Competition 2013, 51 52. Biere, A. 2013b. Two pigeons per hole problem. In Proceedings of SAT Competition 2013, 103 103. Biere, A. 2014. Lingeling essentials, A tutorial on design and implementation aspects of the SAT solver lingeling. In Pragmatics of SAT workshop, 88. Bonet, M. L.; Levy, J.; and Many a, F. 2007. Resolution for Max-SAT. Artif. Intell. 171(8-9):606 618. Bonet, M. L.; Pitassi, T.; and Raz, R. 2000. On interpolation and automatization for frege systems. SIAM J. Comput. 29(6):1939 1967. Bryant, R. E.; Beatty, D. L.; Brace, K. S.; Cho, K.; and Shefﬂer, T. J. 1987. COSMOS: A compiled simulator for MOS circuits. In DAC, 9 16. Buss, S. R., and Clote, P. 1996. Cutting planes, connectivity and threshold logic. Archive for Mathematical Logic 35:33 62. Buss, S. R. 2012. Towards NP-P via proof complexity and proof search. Annals of Pure and Applied Logic 163(9):1163 1182. Cook, S. A., and Reckhow, R. A. 1979. The relative efﬁciency of propositional proof systems. J. Symb. Log. 44(1):36 50. Davies, J., and Bacchus, F. 2011. Solving MAXSAT by solving a sequence of simpler SAT instances. In CP, 225 239. Davies, J., and Bacchus, F. 2013a. Exploiting the power of mip solvers in maxsat. In SAT, 166 181. Davies, J., and Bacchus, F. 2013b. Postponing optimization to speed up MAXSAT solving. In CP, 247 262. Haken, A. 1985. The intractability of resolution. Theor. Comput. Sci. 39:297 308. Huang, J. 2010. Extended clause learning. Artif. Intell. 174(15):1277 1284.

Ignatiev, A.; Morgado, A.; and Marques-Silva, J. 2017. On tackling the limits of resolution in SAT solving. In SAT, 164 183. Larrosa, J., and Heras, F. 2005. Resolution in Max-SAT and its relation to local consistency in weighted CSPs. In IJCAI, 193 198. Martins, R.; Manquinho, V. M.; and Lynce, I. 2014. Open WBO: A modular Max SAT solver,. In SAT, 438 445. Morgado, A.; Heras, F.; Lifﬁton, M. H.; Planes, J.; and Marques-Silva, J. 2013. Iterative and core-guided Max SAT solving: A survey and assessment. Constraints 18(4):478 534. Morgado, A.; Ignatiev, A.; and Marques-Silva, J. 2015. MSCG: Robust core-guided Max SAT solving. JSAT 9:129 134. Narodytska, N., and Bacchus, F. 2014. Maximum satisﬁability using core-guided Max SAT resolution. In AAAI, 2717 2723. Nordstr om, J. 2015. On the interplay between proof complexity and SAT solving. SIGLOG News 2(3):19 44. Palopoli, L.; Pirri, F.; and Pizzuti, C. 1999. Algorithms for selective enumeration of prime implicants. Artif. Intell. 111(1-2):41 72. Pudl ak, P. 1997. Lower bounds for resolution and cutting planes proofs and monotone computations. Journal of Symbolic Logic 62:981 998. Pudl ak, P. 1999. On the complexity of propositional calculus, Sets and proofs. In Logic Colloquium 97, 197 218. Cambridge University Press. Riis, S. 1993. Independence in Bounded Arithmetic. Ph.D. Dissertation, Oxford University. Sabharwal, A. 2005. Algorithmic applications of propositional proof complexity. Ph.D. Dissertation, University of Washington. Saikko, P.; Berg, J.; and J arvisalo, M. 2016. LMHS: A SATIP hybrid Max SAT solver. In SAT, 539 546. Sinz, C. 2005. Towards an optimal CNF encoding of Boolean cardinality constraints. In CP, 827 831. Tseitin, G. S. 1968. On the complexity of derivation in propositional calculus. Studies in constructive mathematics and mathematical logic 2(115-125):10 13.