# Projection, Inference, and Consistency

J. N. Hooker, Carnegie Mellon University, Pittsburgh, USA (jh38@andrew.cmu.edu)

**Abstract.** Projection can be seen as a unifying concept that underlies inference in logic and consistency maintenance in constraint programming. This perspective allows one to import projection methods into both areas, resulting in deeper insight as well as faster solution methods. We show that inference in propositional logic can be achieved by Benders decomposition, an optimization method based on projection. In constraint programming, viewing consistency maintenance as projection suggests a new but natural concept of consistency that is achieved by projection onto a subset of variables. We show how to solve this combinatorial projection problem for some global constraints frequently used in constraint programming. The resulting projections are useful when propagated through decision diagrams rather than the traditional domain store.

## 1 Introduction

Projection can be seen as a unifying concept that underlies both inference in logic and consistency maintenance in constraint programming. Projection methods that have been developed in other contexts can therefore be harnessed to help solve inference and constraint programming problems.

In propositional logic, for example, inference can be conceived as the general problem of deducing all information that can be expressed in terms of a specified subset of variables (atomic propositions). This can also be understood as a projection problem. It can be solved not only by the resolution method, which closely parallels a classical projection method for linear inequalities (Fourier-Motzkin elimination), but by Benders decomposition, an optimization method that may seem unrelated to inference. It is well known that the Benders method can compute a projection onto a subset of variables. An extension of the classical method, logic-based Benders decomposition, solves the inference problem for propositional logic. Moreover, it can take advantage of clause learning techniques used in state-of-the-art propositional satisfiability (SAT) solvers.

Consistency maintenance is a fundamental tool of constraint programming. Its purpose is to exclude assignments of values to variables that are inconsistent with any feasible solution of a constraint, thereby reducing the amount of search necessary to find a solution. The results of consistency maintenance for one constraint are propagated to other constraints through some kind of data structure, such as a domain store, which consists of the set of possible values for each individual variable. Consistency maintenance is both an inference problem and a projection problem. It is an inference problem because it deduces constraints that variable assignments must satisfy. It is a projection problem because excluding infeasible assignments to a subset of variables is equivalent to computing the projection of the feasible set onto those variables.

This unifying perspective can be exploited in constraint programming by addressing consistency maintenance explicitly as a projection problem. Existing types of consistency are already forms of projection, including domain consistency, bounds consistency, and k-consistency. However, viewing consistency in this light suggests a simple type of consistency that has not been investigated. We call it J-consistency, which is achieved by projecting the problem's solution set onto a subset J of variables.
Achieving J-consistency can reduce backtracking when the solver propagates through a richer data structure than a domain store. This research program poses a problem that might be called combinatorial projection: projecting a combinatorial constraint, such as the global constraints routinely used in constraint programming, onto a subset of variables. We solve this problem here for a small collection of standard global constraints: among, sequence, regular, and all-different.

## 2 Inference

Inference can be understood as the process of extracting information that relates to a particular question or topic. For example, if S is a constraint set that describes the operation of a factory, we may wish to deduce facts about a certain product P. Suppose the constraints in S collectively contain variables $x_1, \dots, x_n$, and that $x_1, \dots, x_k$ are the variables relevant to product P. For example, $x_1$ may be the model of P produced, $x_2$ the output level of P, $x_3$ its unit manufacturing cost, and so forth up to $x_k$. Then we wish to deduce from S all constraints containing only $x_1, \dots, x_k$. We will see that this is a projection problem.

Table 1: A set of logical clauses.

$$
\begin{array}{ll}
x_1 \vee x_4 \vee x_5 & \quad x_1 \vee x_4 \vee \neg x_5 \\
x_1 \vee \neg x_5 \vee x_6 & \quad x_1 \vee \neg x_5 \vee \neg x_6 \\
x_2 \vee x_5 \vee x_6 & \quad x_2 \vee x_5 \vee \neg x_6 \\
x_3 \vee \neg x_4 \vee x_5 & \quad x_3 \vee \neg x_4 \vee \neg x_5
\end{array}
$$

### 2.1 Inference as Projection

To make the connection between inference and projection more precise, we standardize terminology as follows. For $J \subseteq \{1, \dots, n\}$, let $x_J$ be the tuple of variables in $\{x_j \mid j \in J\}$ arranged in increasing order of indices, and similarly for $v_J$. Let $D_j$ be the domain of $x_j$, with $D = D_1 \times \cdots \times D_n$ and $D_J = \prod_{j \in J} D_j$.

Projection can be defined semantically by saying that a set $V' \subseteq D_J$ of tuples is the projection onto $x_J$ of $V \subseteq D$ when $V' = \{v_J \mid v \in V\}$. This can be written $V' = V|_J$. However, we are also interested in a syntactic concept that tells us when a constraint set is a projection onto $x_J$ of another constraint set. To this end, we define a constraint to be an object that contains certain variables and is either satisfied or violated by any given assignment of values to those variables. An assignment can satisfy or violate a constraint only when it fixes all variables in the constraint. Let $D_J(S)$ be the set of all $v \in D_J$ for which $x_J = v$ satisfies S (i.e., satisfies all the constraints in S). We say that S is a constraint set over $x$ when it contains only variables in $x = (x_1, \dots, x_n)$, perhaps not all of them. If S is a constraint set over $x$, then S implies constraint C if an assignment to $x$ satisfies C whenever it satisfies S, or $D(S) \subseteq D(\{C\})$.

Let $S'$ and $S$ be constraint sets over $x_J$ and $x$, respectively. We define $S'$ to be a projection onto $x_J$ of S when $S'$ describes the projection onto $x_J$ of S's satisfaction set, or more precisely, $D_J(S') = D(S)|_J$. It is easy to show that projection captures exactly what S implies about $x_J$, in the following sense:

**Lemma 1** Let $S$ and $S'$ be constraint sets over $x$ and $x_J$, respectively. Then $S'$ is a projection of $S$ onto $x_J$ if and only if $S'$ implies all and only the constraints over $x_J$ that are implied by $S$.

As an example, let S consist of the logical clauses in Table 1. The clause set $S' = \{x_1 \vee x_2,\ x_1 \vee x_3\}$ is a projection of S onto $(x_1, x_2, x_3)$. This means that any clause over $(x_1, x_2, x_3)$ implied by S is implied by $S'$. The two clauses in $S'$ capture all that can be inferred in terms of the atoms $x_1, x_2, x_3$. (In fact, they are the prime implicates of S.)
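To make the semantic definition concrete, the following short script (purely illustrative; the literal encoding and function name are invented here) enumerates the satisfying assignments of the clauses in Table 1, projects them onto $(x_1, x_2, x_3)$, and checks that the result coincides with the satisfaction set of $S' = \{x_1 \vee x_2,\ x_1 \vee x_3\}$.

```python
from itertools import product

# Clauses from Table 1 as lists of signed literals: +j means x_j, -j its negation.
S = [[1, 4, 5], [1, 4, -5], [1, -5, 6], [1, -5, -6],
     [2, 5, 6], [2, 5, -6], [3, -4, 5], [3, -4, -5]]
S_proj = [[1, 2], [1, 3]]          # candidate projection S' onto (x1, x2, x3)

def satisfies(assignment, clauses):
    """assignment maps variable index -> bool; each clause needs one true literal."""
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)

# Semantic projection: keep the (x1, x2, x3) part of every satisfying assignment of S.
projection = {v[:3] for v in product([False, True], repeat=6)
              if satisfies(dict(enumerate(v, start=1)), S)}

# Satisfaction set of S' over (x1, x2, x3).
models_of_Sp = {v for v in product([False, True], repeat=3)
                if satisfies(dict(enumerate(v, start=1)), S_proj)}

print(projection == models_of_Sp)   # True: S' describes D(S)|_J
```

Enumerating $D$ in this way is of course exponential; the methods described next obtain the same projection without enumerating the full solution set.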
### 2.2 Inference for Linear Inequalities

The classical projection method for a system $Ax \ge b$ of linear inequalities is Fourier-Motzkin elimination. We can compute the projection onto $x_1, \dots, x_k$ by eliminating variables $x_n, x_{n-1}, \dots, x_{k+1}$ one at a time. Let S initially be the set of inequalities in $Ax \ge b$. Each variable $x_j$ is eliminated as follows. For each pair of inequalities in S of the form $c\bar{x} + c_0 x_j \ge \gamma$ and $d\bar{x} - d_0 x_j \ge \delta$, where $c_0, d_0 > 0$ and $\bar{x} = (x_1, \dots, x_{j-1})$, we have

$$\frac{\gamma - c\bar{x}}{c_0} \;\le\; x_j \;\le\; \frac{d\bar{x} - \delta}{d_0},$$

or $L \le x_j \le U$ for short. We therefore add the inequality $L \le U$ to S for each such pair. This done, we remove from S all inequalities that contain $x_j$. The inequalities in S at the end of the procedure describe the projection and therefore capture everything that can be inferred from $Ax \ge b$ in terms of $x_1, \dots, x_k$.

Fourier-Motzkin elimination can also be used to compute inferences in probability logic, which can be formulated as an optimization problem with linear inequality constraints [Boole, 1854; Hailperin, 1976; Nilsson, 1986]. It is more efficient, however, to solve the problem with modern linear programming (LP) methods that use column generation [Hansen and Perron, 2008; Hooker, 1988b; Jaumard et al., 1991; Klinov and Parsia, 2013].

### 2.3 Inference in Propositional Logic

An elimination procedure based on resolution [Quine, 1952; 1955] solves the projection problem for logical clauses. The procedure is the same as Fourier-Motzkin elimination, except that when eliminating variable $x_j$, it considers pairs of clauses of the form $C \vee x_j$ and $D \vee \neg x_j$, where no variable occurs negated in C and posited in D (or vice versa). Each pair generates a resolvent on $x_j$, namely $C \vee D$. Resolution can in fact be seen as a form of Fourier-Motzkin elimination plus rounding [Williams, 1987]. It can be shown [Hooker, 1992b; 2012] that eliminating the variables $x_j$ for $j \notin J$ by resolution (in any order) yields a projection of S onto $x_J$.

Resolution tends to be impractical, but Benders decomposition [Benders, 1962] can be much more efficient, especially when J is small. The classical Benders method applies only to problems with an LP subproblem, but we use logic-based Benders decomposition, which is suitable for general constraint solving and optimization [Hooker, 2000; 2007; 2012; Hooker and Ottosson, 2003]. We apply Benders decomposition to a clause set S as follows. The master problem (initially empty) consists of Benders cuts in the form of clauses over $x_J$. Each iteration of the Benders method begins by checking whether the master problem is infeasible, in which case the procedure terminates. Otherwise a solution $\bar{x}_J$ of the master problem is obtained. This defines a subproblem $S(\bar{x}_J)$ that is the result of fixing $x_J$ to $\bar{x}_J$ in S. If $S(\bar{x}_J)$ is infeasible, a Benders cut or nogood clause is generated that excludes $\bar{x}_J$, as well as perhaps other values of $x_J$ for which $S(x_J)$ is infeasible for similar reasons. If $S(\bar{x}_J)$ is feasible, a clause (an enumerative Benders cut) is generated that excludes only $\bar{x}_J$. In either case, the Benders cut is added to the master problem, and the process repeats. At termination, the nogood clauses in the master problem define the projection of S onto $x_J$ [Hooker, 2012].

This procedure can be implemented by a single depth-first branching algorithm that generates conflict clauses. Let the variables in $x_J$ come first in the branching order. When unit propagation detects unsatisfiability at a node of the tree, generate conflict clauses and backtrack (see [Beame et al., 2003] for a survey of these concepts). Subsequent branches must be consistent with the conflict clauses generated so far. When a feasible node is reached, backtrack to the last variable in $x_J$. When enumeration is complete, the conflict clauses over $x_J$ define the projection of S onto $x_J$. Because the search backtracks to level |J| when a satisfying solution is found, the algorithm can be practical when J is not too large.
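The following sketch conveys the structure of this scheme in its simplest enumerative form. It is illustrative only: the function name is invented, and a brute-force satisfiability check over the remaining variables stands in for unit propagation and conflict analysis, so every infeasible master solution yields one nogood rather than a stronger conflict clause.

```python
from itertools import product

def project_clauses(clauses, n, J):
    """Return nogood clauses over x_J describing the projection of `clauses`
    onto x_J.  Clauses are lists of signed literals (+j for x_j, -j for its
    negation); J is a list of variable indices."""
    def sat(assignment):
        return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)
    rest = [j for j in range(1, n + 1) if j not in J]
    nogoods = []
    for vals in product([False, True], repeat=len(J)):        # master-problem solutions
        trial = dict(zip(J, vals))
        # Subproblem S(x_J): does some completion over the remaining variables satisfy S?
        feasible = any(sat({**trial, **dict(zip(rest, w))})
                       for w in product([False, True], repeat=len(rest)))
        if not feasible:
            # Nogood (Benders cut) excluding this assignment to x_J.
            nogoods.append([j if not trial[j] else -j for j in J])
    return nogoods

# Projection of the Table 1 clauses onto (x1, x2, x3):
S = [[1, 4, 5], [1, 4, -5], [1, -5, 6], [1, -5, -6],
     [2, 5, 6], [2, 5, -6], [3, -4, 5], [3, -4, -5]]
print(project_clauses(S, 6, [1, 2, 3]))   # [[1, 2, 3], [1, 2, -3], [1, -2, 3]]
```

The three nogoods produced for Table 1 exclude exactly the assignments in which $x_1$ is false and $x_2, x_3$ are not both true; resolving them on $x_3$ and $x_2$ gives the two clauses $x_1 \vee x_2$ and $x_1 \vee x_3$ obtained in the text. An implementation along the lines described above would generate such stronger conflict clauses directly rather than one nogood per enumerated assignment.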
Suppose, for example, that we wish to project the clause set in Table 1 onto $x_J$ for $J = \{1, 2, 3\}$. A branching tree appears in Fig. 1. Upon completion of the search, the set of conflict clauses over $(x_1, x_2, x_3)$ is a projection onto $x_J$, in this case $\{x_1 \vee x_2,\ x_1 \vee x_3\}$.

*Figure 1: Branching tree for a SAT instance. Dashed arcs indicate $x_j = F$ and solid arcs $x_j = T$. Conflict clauses are shown at failure nodes. Solutions are found at the remaining leaf nodes, from which the search backtracks to $x_3$.*

Other adaptations of resolution and Fourier-Motzkin elimination can be used to compute projections for cardinality clauses [Hooker, 1988a], 0–1 linear inequalities [Hooker, 1992a], and general integer linear inequalities [Williams and Hooker, 2014].

## 3 Consistency Maintenance

A constraint set S over $x$ is domain consistent when, for each variable $x_j$ and each value $v \in D_j$, the assignment $x_j = v$ is part of some assignment to $x$ that satisfies S. This is equivalent to saying that $\{x_j \in D_j\}$ is a projection of S onto $x_j$, or $D_j = D(S)|_{\{j\}}$, for $j = 1, \dots, n$. Maintaining domain consistency (or an approximation of it) for some individual constraints in S, and propagating the reduced domains through a domain store, tends to reduce the size of the search tree because the domains are smaller.

Another type of consistency related to backtracking is k-consistency. It is again achieved by projection, but by projecting only subsets of constraints over k variables onto subsets of k−1 variables [Freuder, 1982].

### 3.1 J-Consistency

We propose a type of consistency that is more directly related to projection and naturally generalizes domain consistency. Let S be J-consistent when some $S' \subseteq S$ is a projection of S onto $x_J$. That is, S contains a subset $S'$ of constraints that describes its projection onto $x_J$, or $D_J(S') = D(S)|_J$. If we view S as containing the in-domain constraints $x_j \in D_j$, then S is domain consistent if and only if it is {j}-consistent for $j = 1, \dots, n$.

If we branch on the variables in the order $x_1, \dots, x_n$, a natural strategy is to project out variables in the reverse order $x_n, x_{n-1}, \dots$ until the computational burden becomes excessive. We will see below that for some important global constraints, it is relatively easy to project out some or all of the variables.

There is no point in maintaining J-consistency for individual constraints when propagation is through a domain store. However, recent research shows that propagation through relaxed decision diagrams can be substantially more effective than domain propagation [Bergman et al., 2011; 2014; 2016; Cire and van Hoeve, 2012; 2013]. Maintaining J-consistency could have a significant effect on propagation in this context. This is illustrated in [Hooker, 2016].
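As a toy illustration of the distinction (the constraint and all names below are invented for the purpose, and the computation is by brute force over an explicitly listed solution set), domain consistency keeps only the per-variable projections, while J-consistency keeps the joint projection onto $x_J$:

```python
# A constraint over (x1, x2, x3, x4) given in extension as its set of solutions.
solutions = {(1, 2, 3, 4), (2, 1, 4, 3), (1, 3, 2, 4), (3, 1, 4, 2)}

# Domain consistency: project the solution set onto each variable separately.
domains = [{t[j] for t in solutions} for j in range(4)]
print(domains)                     # [{1, 2, 3}, {1, 2, 3}, {2, 3, 4}, {2, 3, 4}]

# J-consistency for J = {1, 2}: project the solution set onto (x1, x2) jointly.
proj_J = {(t[0], t[1]) for t in solutions}
print(proj_J)                      # {(1, 2), (1, 3), (2, 1), (3, 1)}

# The domain store admits the pair (x1, x2) = (1, 1), which the projection rules
# out; a richer propagation structure (such as a decision diagram over x1, x2)
# can carry this joint information from one constraint to another.
print((1, 1) in proj_J, 1 in domains[0] and 1 in domains[1])   # False True
```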
### 3.2 Projection of Among Constraint

Projecting out variables in an among constraint [Beldiceanu and Contejean, 1994] is relatively simple because each variable elimination yields another among constraint. If $x = (x_1, \dots, x_n)$, the constraint among$(x, V, \ell, u)$ requires that at least $\ell$ and at most $u$ of the variables in $x$ take a value in $V$. Variable $x_n$ is projected out as follows. Let $\alpha^+$ denote $\max\{\alpha, 0\}$, and assume $0 \le \ell \le u \le n$.

**Theorem 2** The projection of among$(x, V, \ell, u)$ onto $\bar{x} = (x_1, \dots, x_{n-1})$ is among$(\bar{x}, V, \ell', u')$, where

$$(\ell', u') = \begin{cases} \bigl((\ell - 1)^+,\ u - 1\bigr), & \text{if } D_n \subseteq V \\ \bigl(\ell,\ \min\{u, n-1\}\bigr), & \text{if } D_n \cap V = \emptyset \\ \bigl((\ell - 1)^+,\ \min\{u, n-1\}\bigr), & \text{otherwise.} \end{cases}$$

Variables $x_n, x_{n-1}, \dots, x_1$ are projected out sequentially by applying the theorem recursively. The original constraint is feasible if and only if $\ell' \le u'$ after projecting out all variables.
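As a quick illustration (the function names and the small example are invented here), the recursion of Theorem 2 can be transcribed directly:

```python
def eliminate_last(domains, V, lo, up):
    """One application of Theorem 2: project among(x, V, lo, up) onto
    (x_1, ..., x_{n-1}) by eliminating x_n.  Returns the new bounds."""
    Dn, n = domains[-1], len(domains)
    if Dn <= V:                       # D_n a subset of V: x_n always counts
        return max(lo - 1, 0), up - 1
    if not (Dn & V):                  # D_n disjoint from V: x_n never counts
        return lo, min(up, n - 1)
    return max(lo - 1, 0), min(up, n - 1)   # x_n may or may not count

def project_among(domains, V, lo, up, k=0):
    """Project among(x, V, lo, up) onto (x_1, ..., x_k) by applying Theorem 2
    recursively; with k = 0 the result is a feasibility test (lo' <= up')."""
    domains = list(domains)
    while len(domains) > k:
        lo, up = eliminate_last(domains, V, lo, up)
        domains.pop()
    return lo, up

# among((x1, ..., x4), {a, b}, 2, 3) with the domains below, projected onto ():
D = [{'a', 'c'}, {'a', 'b'}, {'c'}, {'b'}]
print(project_among(D, {'a', 'b'}, 2, 3))   # (0, 0): feasible, since lo' <= up'
```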
### 3.3 Projection of Sequence Constraint

Fourier-Motzkin elimination provides a fast and convenient method for projecting a sequence constraint. The constraint has an integrality property that makes a polyhedral projection technique adequate, and Fourier-Motzkin elimination simplifies to the point that a single generalized sequence constraint describes the projection after each variable elimination.

We assume without loss of generality that the sequence constraint applies to 0–1 variables $x_1, \dots, x_n$ [van Hoeve et al., 2006; Régin and Puget, 1997]. It enforces overlapping constraints of the form

$$\text{among}\bigl((x_{\ell-q+1}, \dots, x_\ell),\ \{1\},\ L_\ell,\ U_\ell\bigr) \tag{1}$$

for $\ell = q, \dots, n$, where $L_\ell, U_\ell$ are nonnegative integers, and where domain $D_j$ is defined by $\alpha_j \le x_j \le \beta_j$ for $\alpha_j, \beta_j \in \{0, 1\}$. Note that we allow different bounds for different positions in the sequence. The following theorem provides a recursion for eliminating $x_n, \dots, x_1$:

**Theorem 3** Given any $k \in \{0, \dots, n\}$, the projection of the sequence constraint defined by (1) onto $(x_1, \dots, x_k)$ is described by a generalized sequence constraint that enforces constraints of the form

$$\text{among}\bigl((x_i, \dots, x_\ell),\ \{1\},\ L_{\ell i},\ U_{\ell i}\bigr) \tag{2}$$

where $i = \ell - q + 1, \dots, \ell$ for $\ell = q, \dots, k$ and $i = 1, \dots, \ell$ for $\ell = 1, \dots, q - 1$. The projection of the sequence constraint onto $(x_1, \dots, x_{k-1})$ is given by (2) with $L_{k-1,i}$ and $U_{k-1,i}$ replaced by

$$\hat L_{k-1,i} = \max\bigl\{L_{k-1,i},\ L_{k,i} - \beta_k\bigr\}, \qquad \hat U_{k-1,i} = \min\bigl\{U_{k-1,i},\ U_{k,i} - \alpha_k\bigr\}$$

for $i = \max\{1,\, k-q+1\}, \dots, k-1$, with the remaining bounds unchanged.

The worst-case complexity of projecting out each variable $x_k$ is $O(kq)$.

### 3.4 Projection of Regular Constraint

The regular constraint [Pesant, 2004] formulates scheduling and related constraints as regular expressions. Projection can be carried out by constructing and truncating the state transition graph for the associated deterministic finite automaton. For example, the regular expression

`(((aa|aaa)(bb|bbb))* | ((cc|ccc)(bb|bbb))*)(ε | (aa|aaa) | (cc|ccc))`

represents a shift scheduling problem and generates the transition graph of Fig. 2 over a 7-day period, where a, b, c are shifts and $x_j$ is the shift assigned to day $j$. The graph shows two feasible shift assignments: aabbaaa and ccbbaaa. Truncating the graph at stage $j = 4$ yields a projection onto $(x_1, x_2, x_3)$, which has the two feasible solutions aab and ccb. Details may be found in [Hooker, 2016].

*Figure 2: State transition graph for a shift scheduling problem instance. $D_j$ is the original domain of $x_j$, and $D'_j$ the result of achieving domain consistency. States 3, 4, 5, and 8 are valid terminal states in the automaton. Dashed lines lead to nonterminal states that are infeasible because there are no out-transitions consistent with the given variable domains.*

| $j$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| $D_j$ | {a, c} | {a, b, c} | {a, b} | {b, c} | {a, c} | {a, b} | {a, b} |
| $D'_j$ | {a, c} | {a, c} | {b} | {b} | {a} | {a} | {a} |

### 3.5 Projection of All-different Constraint

The constraint alldiff$(x)$ requires that $x_1, \dots, x_n$ take different values. While domain consistency is relatively easy to achieve for this constraint, using a matching algorithm, general projection is surprisingly complex. The projection onto $x^k = (x_1, \dots, x_k)$ takes the form of a disjunction of constraint sets, each of which consists of an alldiff constraint and a family of atmost constraints. The number of disjuncts can grow quite large in principle, but the disjuncts tend to simplify and/or disappear as variable elimination proceeds, particularly if the domains are small.

The projection onto $x^k$ is a disjunction of constraint sets, each of which has the form

$$\text{alldiff}(x^k);\quad \text{atmost}(x^k, V_i, b_i) \text{ for } i \in I;\quad x_j \in D_j \text{ for } j = 1, \dots, k \tag{3}$$

where $b_i < k$ for $i \in I$. The atmost constraint says that at most $b_i$ occurrences of values in $V_i$ appear in $x^k$. When $k = n$ there are no atmost constraints. We also note that atmost$(x^k, V_i, b_i)$ is redundant if the number of variables in $x^k$ whose domains intersect $V_i$ is at most $b_i$, or in particular if $k \le b_i$. Algorithm 1 is applied to compute the projections onto $x^{n-1}, \dots, x^k$ sequentially.

**Theorem 4** Algorithm 1 correctly computes the projection of (3) onto $x^{k-1}$.

**Algorithm 1** Given a projection of alldiff$(x^n)$ onto $x^k$, compute a projection onto $x^{k-1}$. The projection onto $x^k$ is assumed to be a disjunction of constraint sets, each of which has the form (3). The algorithm is applied to each disjunct, after which the disjunction of all created constraint sets forms the projection onto $x^{k-1}$.

- For all $i \in I$: if atmost$(x^k, V_i, b_i)$ is redundant, then remove $i$ from $I$.
- For all $i \in I$: if $D_k \cap V_i \ne \emptyset$ and $b_i > 1$, then create a constraint set consisting of alldiff$(x^{k-1})$, atmost$(x^{k-1}, V_{i'}, b_{i'})$ for $i' \in I \setminus \{i\}$, and atmost$(x^{k-1}, V_i, b_i - 1)$.
- Let $R = D_k \setminus \bigcup_{i \in I} V_i$.
- If $|R| > 1$, then create a constraint set consisting of alldiff$(x^{k-1})$, atmost$(x^{k-1}, V_{i'}, b_{i'})$ for $i' \in I$, and atmost$(x^{k-1}, R, |R| - 1)$.
- Else if $|R| = 1$, then let $R = \{v\}$ and remove $v$ from $D_j$ for $j = 1, \dots, k-1$ and from $V_i$ for $i \in I$. If $D_j$ is nonempty for $j = 1, \dots, k-1$, then
  - for all $i' \in I$: if atmost$(x^{k-1}, V_{i'}, b_{i'})$ is redundant, then remove $i'$ from $I$;
  - create a constraint set consisting of alldiff$(x^{k-1})$ and atmost$(x^{k-1}, V_{i'}, b_{i'})$ for $i' \in I$.

As an example, suppose we wish to project alldiff$(x^5)$, where the domains $D_1, \dots, D_5$ are $\{a, b, c\}$, $\{c, d, e\}$, $\{d, e, f\}$, $\{e, f, g\}$, and $\{a, f, g\}$, respectively. The projection onto $x^4$ is

$$\text{alldiff}(x^4),\quad \text{atmost}(x^4, \{a, f, g\}, 2).$$

The projection onto $x^3$ is the disjunction of the following two constraint sets:

$$\text{alldiff}(x^3),\quad \text{atmost}(x^3, \{a, f, g\}, 1)$$
$$\text{alldiff}(x^3),\quad \text{with } D_1, D_2, D_3 = \{a, b, c\}, \{c, d\}, \{d, f\}.$$
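For concreteness, here is a naive transcription of Algorithm 1 (the representation of a disjunct as a pair of domains and atmost constraints, and the function names, are invented here; no claim of efficiency is made), applied to the example above:

```python
def redundant(domains, V, b):
    """atmost(x, V, b) is redundant if at most b of the given domains intersect V."""
    return sum(1 for D in domains if D & V) <= b

def eliminate_last(domains, atmosts):
    """One step of Algorithm 1: given one disjunct of the projection onto x^k
    (domains D_1..D_k plus atmost constraints as (V, b) pairs), return the
    disjuncts of the projection onto x^{k-1}."""
    Dk, rest = domains[-1], domains[:-1]
    # Step 1: drop atmost constraints that are redundant with respect to D_1..D_k.
    atmosts = [(V, b) for (V, b) in atmosts if not redundant(domains, V, b)]
    disjuncts = []
    # Step 2: one disjunct per atmost constraint that D_k intersects (if b > 1).
    for idx, (V, b) in enumerate(atmosts):
        if Dk & V and b > 1:
            others = [a for j, a in enumerate(atmosts) if j != idx]
            disjuncts.append((rest, others + [(V, b - 1)]))
    # Step 3: values of D_k covered by no V_i.
    R = set(Dk) - set().union(*(V for V, _ in atmosts))
    if len(R) > 1:
        disjuncts.append((rest, atmosts + [(frozenset(R), len(R) - 1)]))
    elif len(R) == 1:
        v = next(iter(R))
        rest2 = [D - {v} for D in rest]
        atm2 = [(V - {v}, b) for V, b in atmosts]
        if all(rest2):   # every reduced domain is nonempty
            atm2 = [(V, b) for (V, b) in atm2 if not redundant(rest2, V, b)]
            disjuncts.append((rest2, atm2))
    return disjuncts

# The example from the text: project alldiff(x^5) with the domains below.
D = [frozenset(s) for s in ("abc", "cde", "def", "efg", "afg")]
onto4 = eliminate_last(D, [])                                        # onto x^4
onto3 = [d for dom, atm in onto4 for d in eliminate_last(dom, atm)]  # onto x^3
for dom, atm in onto3:
    print([set(x) for x in dom], [(set(V), b) for V, b in atm])
```

Run on these five domains, the sketch reproduces the projection onto $x^4$ and the two disjuncts of the projection onto $x^3$ listed above.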
## References

[Beame et al., 2003] P. Beame, H. Kautz, and A. Sabharwal. Understanding the power of clause learning. In International Joint Conference on Artificial Intelligence (IJCAI 2003), 2003.

[Beldiceanu and Contejean, 1994] N. Beldiceanu and E. Contejean. Introducing global constraints in CHIP. Mathematical and Computer Modelling, 20(12):97–123, 1994.

[Benders, 1962] J. F. Benders. Partitioning procedures for solving mixed-variables programming problems. Numerische Mathematik, 4:238–252, 1962.

[Bergman et al., 2011] D. Bergman, W.-J. van Hoeve, and J. N. Hooker. Manipulating MDD relaxations for combinatorial optimization. In CPAIOR 2011 Proceedings, pages 20–35, 2011.

[Bergman et al., 2014] D. Bergman, A. Cire, A. Sabharwal, H. Samulowitz, V. Saraswat, and W.-J. van Hoeve. Parallel combinatorial optimization with decision diagrams. In CPAIOR 2014 Proceedings, pages 351–367, 2014.

[Bergman et al., 2016] D. Bergman, A. A. Cire, W.-J. van Hoeve, and J. N. Hooker. Discrete optimization with binary decision diagrams. INFORMS Journal on Computing, 28:47–66, 2016.

[Boole, 1854] G. Boole. An Investigation of the Laws of Thought, On Which are Founded the Mathematical Theories of Logic and Probabilities. Walton and Maberly, London, 1854.

[Cire and van Hoeve, 2012] A. A. Cire and W.-J. van Hoeve. MDD propagation for disjunctive scheduling. In Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS), pages 11–19, 2012.

[Cire and van Hoeve, 2013] A. A. Cire and W.-J. van Hoeve. Multivalued decision diagrams for sequencing problems. Operations Research, 61:1411–1428, 2013.

[Freuder, 1982] E. C. Freuder. A sufficient condition for backtrack-free search. Journal of the ACM, 29:24–32, 1982.

[Hailperin, 1976] T. Hailperin. Boole's Logic and Probability, volume 85 of Studies in Logic and the Foundations of Mathematics. North-Holland, Amsterdam, 1976.

[Hansen and Perron, 2008] P. Hansen and S. Perron. Merging the local and global approaches to probabilistic satisfiability. International Journal of Approximate Reasoning, 47:125–140, 2008.

[Hooker and Ottosson, 2003] J. N. Hooker and G. Ottosson. Logic-based Benders decomposition. Mathematical Programming, 96:33–60, 2003.

[Hooker, 1988a] J. N. Hooker. Generalized resolution and cutting planes. Annals of Operations Research, 12:217–239, 1988.

[Hooker, 1988b] J. N. Hooker. A mathematical programming model for probabilistic logic. Working paper 05-88-89, Graduate School of Industrial Administration, Carnegie Mellon University, 1988.

[Hooker, 1992a] J. N. Hooker. Generalized resolution for 0–1 linear inequalities. Annals of Mathematics and Artificial Intelligence, 6:271–286, 1992.
[Hooker, 1992b] J. N. Hooker. Logical inference and polyhedral projection. In Computer Science Logic (CSL 1991), volume 626 of Lecture Notes in Computer Science, pages 184–200. Springer, 1992.

[Hooker, 2000] J. N. Hooker. Logic-Based Methods for Optimization: Combining Optimization and Constraint Satisfaction. Wiley, New York, 2000.

[Hooker, 2007] J. N. Hooker. Planning and scheduling by logic-based Benders decomposition. Operations Research, 55:588–602, 2007.

[Hooker, 2012] J. N. Hooker. Integrated Methods for Optimization, 2nd ed. Springer, 2012.

[Hooker, 2016] J. N. Hooker. Projection, consistency, and George Boole. Constraints, 21:59–76, 2016.

[Jaumard et al., 1991] B. Jaumard, P. Hansen, and M. P. Aragão. Column generation methods for probabilistic logic. INFORMS Journal on Computing, 3:135–148, 1991.

[Klinov and Parsia, 2013] P. Klinov and B. Parsia. Pronto: A practical probabilistic description logic reasoner. In Uncertainty Reasoning for the Semantic Web II (URSW 2008–2010), pages 59–79, 2013.

[Nilsson, 1986] N. J. Nilsson. Probabilistic logic. Artificial Intelligence, 28:71–87, 1986.

[Pesant, 2004] G. Pesant. A regular language membership constraint for finite sequences of variables. In Principles and Practice of Constraint Programming (CP 2004), pages 482–495, 2004.

[Quine, 1952] W. V. Quine. The problem of simplifying truth functions. American Mathematical Monthly, 59:521–531, 1952.

[Quine, 1955] W. V. Quine. A way to simplify truth functions. American Mathematical Monthly, 62:627–631, 1955.

[Régin and Puget, 1997] J.-C. Régin and J.-F. Puget. A filtering algorithm for global sequencing constraints. In Principles and Practice of Constraint Programming (CP 1997), pages 32–46, 1997.

[van Hoeve et al., 2006] W.-J. van Hoeve, G. Pesant, L.-M. Rousseau, and A. Sabharwal. Revisiting the sequence constraint. In Principles and Practice of Constraint Programming (CP 2006), pages 620–634, 2006.

[Williams and Hooker, 2014] H. P. Williams and J. N. Hooker. Integer programming as projection. Working paper LSEOR 13.143, London School of Economics, 2014.

[Williams, 1987] H. P. Williams. Linear and integer programming applied to the propositional calculus. International Journal of Systems Research and Information Science, 2:81–100, 1987.