# propagating_regular_counting_constraints__33310e71.pdf

Propagating Regular Counting Constraints

Nicolas Beldiceanu TASC team (CNRS/INRIA) Mines de Nantes 44307 Nantes, France

Pierre Flener and Justin Pearson Department of Information Technology Uppsala University 751 05 Uppsala, Sweden

Pascal Van Hentenryck Optimization Research Group, NICTA and Australian National University Canberra, Australia

Constraints over ﬁnite sequences of variables are ubiquitous in sequencing and timetabling. This led to general modelling techniques and generic propagators, often based on deterministic ﬁnite automata (DFA) and their extensions. We consider counter-DFAs (c DFA), which provide concise models for regular counting constraints, that is constraints over the number of times a regular-language pattern occurs in a sequence. We show how to enforce domain consistency in polynomial time for at-most and at-least regular counting constraints based on the frequent case of a c DFA with only accepting states and a single counter that can be increased by transitions. We also show that the satisfaction of exact regular counting constraints is NP-hard and that an incomplete propagator for exact regular counting constraints is faster and provides more pruning than the existing propagator from (Beldiceanu, Carlsson, and Petit 2004). Finally, by avoiding the unrolling of the c DFA used by COSTREGULAR, the space complexity reduces from O(n |Σ| |Q|) to O(n (|Σ| + |Q|)), where Σ is the alphabet and Q the state set of the c DFA.

1 Introduction

Constraints over ﬁnite sequences of variables arise in many sequencing and timetabling applications. The last decade has witnessed signiﬁcant research on how to model and propagate, in a generic way, idiosyncratic constraints that are often featured in these applications. The resulting modelling techniques are often based on formal languages and, in particular, deterministic ﬁnite automata (DFA). Indeed, DFAs are a convenient tool to model a wide variety of constraints, and their associated propagators can enforce domain consistency in polynomial time (Beldiceanu, Carlsson, and Petit 2004; Pesant 2004). This paper is concerned with the concept of counter DFA (c DFA), an extension of DFAs initially proposed in (Beldiceanu, Carlsson, and Petit 2004), and uses it to model regular counting constraints, that is constraints on the number of regular-language patterns occurring in a sequence of variables. c DFAs typically result in more concise and natural encodings of regular counting constraints compared to DFAs, but, to our knowledge, there is no published

Copyright c 2014, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

proof that they do not admit efﬁcient propagators enforcing domain consistency. This paper originated as an attempt to clarify this issue and to overcome the practical limitation of current ﬁltering algorithms due to a large memory consumption stemming from the explicit unrolling of the automaton (Pesant 2004; Demassey, Pesant, and Rousseau 2006). We consider the subset of c DFAs satisfying two conditions: (1) all their states are accepting, and (2) they manipulate a single counter that can be increased by transitions. These conditions are satisﬁed for many regular counting constraints and offer a good compromise between expressiveness and efﬁciency. Our ﬁrst contribution is to show that for such a c DFA A it is possible to enforce domain consistency efﬁciently on at-most and at-least regular counting constraints. The constraint REGCOUNTATMOST(N, X, A) holds if the counter of A is at most N after A has consumed sequence X. The constraint REGCOUNTATLEAST(N, X, A) is deﬁned similarly. We also show the NP-hardness of feasibility testing for the constraint REGCOUNT(N, X, A), which holds if the counter of A is exactly N after A has consumed X. Compared to the constraint COSTREGULAR (Demassey, Pesant, and Rousseau 2006), as generalised for the Choco solver (Choco 2012), our second contribution is a propagator for exact regular counting that uses asymptotically less space (for its internal data structures) and yet propagates more on the variables of X. Furthermore, our propagators for at-most and at-least regular counting achieve domain consistency on the counter variable N (and X) in the same asymptotic time as the COSTREGULAR propagator achieves only bounds consistency on N (but also domain consistency on X). The rest of the paper is organised as follows. Section 2 deﬁnes c DFAs and regular counting constraints. Section 3 gives the propagator, its complexity, and its evaluation. Section 4 discusses related work and Section 5 concludes.

2 Background

Deterministic Finite Counter Automata

Recall that a deterministic ﬁnite automaton (DFA) is a tuple Q, Σ, δ, q0, F , where Q is the set of states, Σ is the alphabet, δ: Q Σ Q is the transition function, q0 Q is the start state, and F Q is the set of accepting states.

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence

ϵ {k 0} a aa

a b {k k + 1}

Figure 1: Counter-DFA AAB for the constraint NUMBERWORD(N, X, aab )

This paper considers a subclass of counter-DFAs in which all states are accepting and only one counter is used. The counter is initialised to 0 and increases by a given natural number at every transition. Such an automaton accepts every string and assigns a value to its counter. More formally, a counter-DFA (c DFA) is here speciﬁed as a tuple Q, Σ, δ, q0, F , where Q, Σ, q0, and F are as in a DFA except that F = Q and the DFA transition function δ is extended to the signature Q Σ Q N, so that δ(q, ℓ) = r, inc indicates that r is the successor state of state q upon reading alphabet symbol ℓand that the counter is increased by inc. (All our results generalise to inc R+, that is weighted regular counting.) We also deﬁne two projections of this extended transition function: if δ(q, ℓ) = r, inc , then δQ(q, ℓ) = r and δN(q, ℓ) = inc.

Given δ(q, ℓ) = r, inc , we denote by C(q ℓ r) the counter

increase inc of transition q ℓ r from state q to state r upon consuming symbol ℓ. Similarly, we denote by C(q σ r) the counter increase of a path q σ r from state q to state r upon consuming a (possibly empty) string σ.

Example 1. Consider the automaton AAB in Figure 1. It represents a c DFA with state set Q = {ϵ, a, aa} and alphabet Σ = {a, b}. The transition function δ is given by the labelled arcs between states, and the start state is q0 = ϵ (indicated by an arc coming from no state; we often denote the start state by ϵ, because it can be reached by consuming the empty string ϵ). Since the ﬁnal states F are all the states in Q, this automaton recognises every string over {a, b} and is thus by itself not very interesting. However, the c DFA features a counter k that is initialised to 0 at the start state, increased by 1 on the transition from state aa to state ϵ upon reading symbol b , and increased by 0 on all other transitions. As a result, the ﬁnal value of k is the number of occurrences of the word aab within the string.

Regular Counting Constraints

A regular counting constraint is deﬁned as a constraint that can be modelled by a c DFA. The REGCOUNT(N, X, A) constraint holds if the value of variable N, called the counter variable, is equal to the ﬁnal value of the counter after c DFA A has consumed the values of the entire sequence X of variables. Consider the constraint NUMBERWORD(N, X, w), which holds if N is the number of occurrences of the non-empty word w in the sequence X of variables. The constraint NUMBERWORD(N, X, aab ) can be modelled by the con-

xi V {k k + 1}

Figure 2: Counter-DFA AMONG for the constraint AMONG(N, X, V)

straint REGCOUNT(N, X, AAB) with the automaton AAB speciﬁed in Figure 1.

Signature Constraints A constraint on a sequence X of variables can sometimes be modelled with the help of a DFA or c DFA that operates not on X, but on a sequence of signature variables that functionally depend via signature constraints on a sliding window of variables within X (Beldiceanu, Carlsson, and Petit 2004). For example, the AMONG(N, X, V) constraint (Beldiceanu and Contejean 1994) requires N to be the number of variables in the sequence X that are assigned a value from the given set V. With signature constraints xi V si = 1 and xi / V si = 0 (with xi X), we obtain a sequence of |X| signature variables si that can be used in a c DFA that counts the number of occurrences of value 1 in that sequence. (The choice of AMONG is purely pedagogical: we do not claim this is the best way to model and propagate this constraint.) Rather than labelling the transitions of such a c DFA with values of the domain of the signature variables (the set {0, 1} here), we label them with the corresponding conditions of the signature constraints, giving the c DFA in Figure 2. If each signature variable depends on a sliding window of size 1 within X (as for AMONG), then the signature constraints are unary. Our results also apply to c DFAs with unary signature constraints because a network of a REGCOUNTATMOST constraint and unary signature constraints is Berge-acyclic.

3 The Propagator Feasibility Test and Domain Consistency Filtering Our propagator is deﬁned in terms of the following concepts, which assume a sequence x1, . . . , xn of variables with domains denoted by dom(xi):

Deﬁne P(i) (respectively P(i)) to be the set of pairs q, c where c is the minimum (respectively maximum) counter increase (or value) after the automaton consumes any preﬁx string σ = σ1 σi from state q0 to reach state q, with i [0, n] and σj dom(xj) for each j [1, i].

Deﬁne S(i) (respectively S(i)) to be the set of pairs q, c where c is the minimum (respectively maximum) counter increase after the automaton consumes any sufﬁx string σ = σi σn from state q to reach a state appearing in P(n) (respectively P(n)), with i [1, n + 1] and σj dom(xj) for each j [i, n]. Example 2. By illustrating one representative of these four quantities, we show that we have to maintain the maximum

r {k k + 1}

s r {k k + 2}

r {k k + 2}

Figure 3: Counter-DFA RST

counter value for every state reachable from q0 in i steps, rather than just maintaining the overall maximum counter value and the set of states reachable from q0 in i steps. Consider the automaton RST in Figure 3, where q0 is ϵ. In a sequence of n = 6 variables x1, . . . , x6 that must be assigned value r or t , we have:

P(0) = { ϵ, 0 }

P(1) = { ϵ, 0 , r, 1 }

P(2) = { ϵ, 1 , r, 1 , rr, 1 }

P(3) = { ϵ, 1 , r, 2 , rr, 1 , rrt, 1 }

P(4) = { ϵ, 2 , r, 2 , rr, 2 , rrt, 1 , rrtr, 3 }

P(5) = { ϵ, 2 , r, 3 , rr, 3 , rrt, 2 , rrtr, 3 }

P(6) = { ϵ, 3 , r, 3 , rr, 3 , rrt, 3 , rrtr, 4 }

Indeed, rrtr, 4 P(6) because rrt, 2 P(5) and there is a transition in A from rrt to rrtr on symbol r with a counter increase of 2, even though three states have a higher counter value (namely 3) than rrt in P(5). A decomposition using sliding transition constraints where the state and counter variables are not explicitly linked, such as in (Beldiceanu, Carlsson, and Petit 2004), has no link between the reached state at level i and the corresponding counter value, and thus hinders propagation. To compute P(i) and S(i), we need an operation that takes a set of state-and-integer pairs and keeps only the pairs q, c where there is no pair q, c with c < c. Formally, trim Min(S) = { q, c S | q, c S : c < c}. For brevity, we use trim Min φ(q,c) ( q, c ) to denote

trim Min({ q, c | φ(q, c)}), for any condition φ. We inductively deﬁne P(i) and S(i) as follows:

{ q0, 0 } if i = 0 trim Min q,c P(i 1) ℓ dom(xi) ( δQ(q, ℓ), c + δN(q, ℓ) ) if i [1, n]

{ q, 0 | c N : q, c P(n)} if i = n + 1 trim Min q ,c S(i+1) ℓ dom(xi) δ(q,ℓ)= q ,inc

( q, c + inc ) if i [1, n]

We show that the inductively computed quantities correspond to the deﬁnitions of P(i) and S(i). First consider P(i). The base case P(0) follows from the initialisation to zero of the counter. By induction, suppose the set P(i 1) is correct. Before applying trim Min, the set contains all pairs obtained upon reading the symbol ℓstarting from some pair q, c in P(i 1), where c is the minimum counter value for q over sequences of length i 1. The trim Min operation then ﬁlters out all the pairs q , c with non-minimum counter value for q . The correctness proof for S(i) is similar. We deﬁne the REGCOUNTATMOST(N, X, A) propagator. The propagator for REGCOUNTATLEAST is similar. The following theorem gives a feasibility test. Theorem 1. A REGCOUNTATMOST(N, [x1, . . . , xn], A) constraint has a solution iff the minimum value of the counter of A after consuming the entire sequence is at most the maximum of the domain of N:

min q,c P(n) c max(dom(N))

Proof. Suppose c is the minimum counter value such that q, c P(n) for some state q. By the deﬁnition of P(n), there is some sequence σ = σ1 σn where for all 1 j n the symbol σj belongs to dom(xj) such that C(q0 σ q) = c. Because each σj belongs to the domain of the corresponding variable, we have that σ is a solution to REGCOUNTATMOST iff c max(dom(N)).

We now show how to achieve domain consistency (DC).

Theorem 2. For a REGCOUNTATMOST(N, [x1, . . . , xn], A) constraint, deﬁne the minimum value of the counter of A for variable xi to take value ℓ:

m(i, ℓ) = min q,c P(i 1) , q =δQ(q,ℓ) , q ,c S(i+1)

(c + δN(q, ℓ) + c )

A value ℓin dom(xi) (with i [1, n]) appears in a solution iff the minimum value of the counter is at most the maximum of the domain of N: m(i, ℓ) max(dom(N)). A value in dom(N) appears in a solution iff it is at least the minimum counter value given in Theorem 1.

Proof. We start with the ﬁrst claim. (If) We show that any ℓ dom(xi) with m(i, ℓ) max(dom(N)) participates in a solution. Suppose m(i, ℓ) equals c + δN(q, ℓ) + c for some q, c P(i 1) and some q , c S(i + 1), with q = δQ(q, ℓ). Then there exist two strings σ = σ1 σi 1 and τ = σi+1 σn and some state qn such that

C(q0 σ q) = c and C(q τ qn) = c

with σj dom(xj) for all j [1, n]. Note that the length of σℓτ is n. We have:

C(q0 σℓτ qn) = C(q0 σ q) + δN(q, ℓ) + C(q τ qn)

= c + δN(q, ℓ) + c

= m(i, ℓ) max(dom(N)).

Hence the assignment corresponding to σℓτ satisﬁes the domains and the constraint, so ℓ dom(xi) participates in a

solution. (Only if) If ℓ dom(xi) participates in a solution, then the counter of that solution is at least m(i, ℓ) and at most max(dom(N)), hence m(i, ℓ) max(dom(N)). The second claim follows from Theorem 1. Indeed, let c = min q,c P(n) c So there exists a sequence σ = σ1 σn with each σj dom(xj) such that C(q0 σ q) = c. Further, for any σ = σ 1 σ n with each σ j dom(xj) for all j

[1, n], we have c C(q0 σ q n) for some state q n. So, by Theorem 1, we need to prove that v dom(N) participates in a solution iff c v. (Only if) If v dom(N) participates in a solution, then there exists a sequence σ = σ 1 σ n such that each σ j dom(xj) and C(q0 σ q n) v. Since

c C(q0 σ q n), we have c v. (If) If c v, then the sequence σ above necessarily also forms a solution with N = v.

The algorithm of an idempotent propagator consists of the feasibility test of Theorem 1 and the pruning of the values from dom(N) and all dom(xi) that do not satisfy the conditions of Theorem 2, upon ﬁrst computing its m(i, ℓ) values.

Complexity The complexity of a non-incremental implementation of the propagator is established as follows. Recall that we consider sequences of n variables xi, each with at most the automaton alphabet Σ as domain. Let the automaton have |Q| states. Each set P(i) has O(|Q|) elements and takes O(|Σ| |Q|) time to construct and trim (assuming it is implemented as a counter-value array indexed by Q, with all cells initialised to + ). There are n+1 such sets, hence the entire P( ) vector takes O(n |Σ| |Q|) time and Θ(n |Q|) space. Similarly, the entire S( ) vector takes O(n |Σ| |Q|) time and Θ(n |Q|) space. Each value m(i, ℓ) takes O(|Q|) time to construct, since at most |Q| pairs q, c of P(i 1) are iterated over and the corresponding pair q , c is unique and can be retrieved in constant time (under the assumed data structure). There are n |Σ| such values, hence the entire m( , ) matrix takes O(n |Σ| |Q|) time and Θ(n |Σ|) space. Each test of a domain value takes constant time, hence Θ(n + 1) time in total for the n variables xi and the counter variable N. In total, such an implementation takes O(n |Σ| |Q|) time, and Θ(n (|Q| + |Σ|)) space.

The Exact Regular Counting Constraint Not surprisingly, decomposing REGCOUNT(N, X, A) into the conjunction of REGCOUNTATMOST(N, X, A) and REGCOUNTATLEAST(N, X, A) does not yield DC at the ﬁxpoint of their propagators: for the c DFA B in Figure 4 and the constraint REGCOUNT(N, [2, x, 2], B), with N {0, 1, 2} and x {1, 2}, it misses the inference of N = 1. Worse, achieving DC on exact regular counting is NP-hard:

Theorem 3. The feasibility of REGCOUNT is NP-hard.

Proof. By reduction from Subset-Sum. Consider an instance {a1, . . . , ak}, s of Subset-Sum, which holds if there is a subset A {a1, . . . , ak} such that P v A v = s. Construct a c DFA A with one state and alphabet Σ =

1 2 1 2 {k k + 1}

Figure 4: Counter-DFA B, where domain consistency is not achieved for exact regular counting

{a1, . . . , ak, 0}. A transition labelled by ai increases the counter by ai, and the transition labelled by 0 does not increase the counter. Build a sequence of variables X = x1, . . . , xk such that dom(xi) = {0, ai} and a variable N such that dom(N) = {s}. Such a reduction can be done in polynomial time. Subset-Sum holds iff REGCOUNT(N, X, A) holds.

The propagator for REGCOUNTATMOST can be generalised into an incomplete propagator for REGCOUNT. A value ℓis removed from the domain of variable xi if the following condition holds for all q, c P(i 1):

q,c P(i 1) q =δQ(q,ℓ) q ,c S(i+1) q ,c S(i+1)

c + δN(q, ℓ) + c , c + δN(q, ℓ) + c

This propagator has the same space complexity as REGCOUNTATMOST, but it may need more than one run to achieve idempotency. Indeed, it differs from the previous propagator in that lower and upper bounds have to be calculated for each state in P(i 1), and it is possible that some states will give different bounds. Hence the ﬁrst run of the propagator might not reach idempotency. The propagator is strictly stronger than computing the ﬁxpoint of REGCOUNTATMOST and REGCOUNTATLEAST, because the intersection test with respect to dom(N) is strictly stronger than the conjunction of the two comparisons on the at-most and at-least sides: for the c DFA in Figure 4 and the constraint REGCOUNT(1, [2, x, 1, y, z], B), with x, y, z {1, 2}, the REGCOUNT propagator infers z = 2, whereas the decomposition misses this inference. The REGCOUNT propagator is also incomplete: the counterexample before Theorem 3 for the decomposition also applies to it.

Evaluation We implemented in SICStus Prolog version 4.2.1 (Carlsson, Ottosson, and Carlson 1997) the described propagators for REGCOUNTATMOST, REGCOUNTATLEAST, and REGCOUNT. The full source code is in (Beldiceanu et al. 2013). We evaluated their merits as follows. We generated random c DFAs of up to ﬁve states (it is very important to note that all 34 counter automata of the Global Constraint Catalogue (Beldiceanu et al. 2007) have at most ﬁve states, since counters are a very powerful device that allows a drastic reduction from the number of states needed by using a conventional DFA) using the random DFA generator (Almeida, Moreira, and Reis 2007) of

FAdo (version 0.9.6) and doing a counter increase by 1 on each arc with a probability of 20%. For each random c DFA, we generated random instances, with random lengths (up to n = 10) of X = [x1, . . . , xn] and random initial domains of the counter variable N (one value, two values, and intervals of length 2 or 3) and the signature variables si (intervals of any length, and sets with holes). The results, upon many millions of random instances, are that our REGCOUNT propagator never propagates less but often more, to the point of detecting more failures, than the built-in AUTOMATON (Beldiceanu, Carlsson, and Petit 2004) of SICStus Prolog. Further, it is already up to 3 times faster than the latter, even though it is na ıvely implemented in Prolog, while the built-in works by decomposition into a conjunction of TABLE constraints (with tuples according to the transition function δ), which is very carefully implemented in C. There is thus strong reason to believe that our propagators will do better on any benchmark. Also, no counterexample to the domain consistency of REGCOUNTATMOST has been generated and no pruning by the propagators of actually supported values was observed, giving a sanity check while proving Theorems 1 and 2. Table 1 gives the cumulative runtimes (under Mac OS X 10.7.5 on a 2.8 GHz Intel Core 2 Duo with a 4 GB RAM), the numbers of detected failures, and (when both propagators succeed) the numbers of pruned values for random instances of some constraints, the four-state c DFA for the NUMBERWORD(N, X, toto ) constraint being unnecessary to reproduce here. To demonstrate the power of our propagators, we have also tested them on constraints whose counter-DFAs have binary signature constraints, so that our at-least and atmost regular counting propagators may not achieve domain consistency, because they were designed for unary signature constraints. For example, the INFLEXION(N, X) constraint holds if there are N inﬂexions (local optima) in the integer sequence X; a c DFA is given in (Beldiceanu et al. 2007), with signature constraints using the predicates xi{<, =, >}xi+1 on the sliding window [xi, xi+1] of size 2. Our exact regular counting propagator outperforms the builtin AUTOMATON (Beldiceanu, Carlsson, and Petit 2004) of SICStus Prolog, as shown in the last line of Table 1. Further, our instance generator has not yet constructed any counterexample to domain consistency on at-most regular counting.

4 Related Work Our regular counting constraints are related to COSTREGULAR(X, A, N, C) (Demassey, Pesant, and Rousseau 2006), an extension of the REGULAR(X, A) constraint (Pesant 2004): a ground instance holds if the sum of the variable-value assignment costs is exactly N after DFA A has accepted X, where the two-dimensional cost matrix C, indexed by Σ and X, gives the costs of assigning each value of the alphabet Σ of A to each variable of the sequence X. Indeed, both the abstractions and the underlying propagators of regular counting constraints and COSTREGULAR are closely related. However, we now argue that regular counting constraints sometimes provide both a more natural abstraction and some computational

beneﬁts, namely more propagation and asymptotically less space, within the same asymptotic time. At the conceptual level, the regular counting and COSTREGULAR constraints differ in how costs are expressed. In the COSTREGULAR constraint the costs are associated with variable-value assignments, while in regular counting constraints the costs (seen as counter increases) are associated with the transitions of the counter automaton. This is an important conceptual distinction, as counter automata provide a more natural and compact abstraction for a variety of constraints, where the focus is on counting rather than costing. Footnote 1 of (Demassey, Pesant, and Rousseau 2006, page 318) points out that the cost matrix C can be made three-dimensional, indexed also by the states Q of A, but this is not discussed further in (Demassey, Pesant, and Rousseau 2006). This allows the expression of costs on transitions, and it seems that this has no impact on the time complexity of their propagator. This generalisation is implemented in the Choco solver (Choco 2012). It is only with such a three-dimensional cost matrix that it is possible for the modeller to post a regular counting constraint by using the COSTREGULAR constraint: ﬁrst unroll the counter automaton for the length |X| into a directed acyclic weighted graph G (as described in (Pesant 2004), and the counter increases become the weights) and then post COSTREGULAR(X, N, G); the Choco implementation (Choco 2012, page 95) of COSTREGULAR features this option. The alternative is to read the three-dimensional cost matrix C off A only (since X is not needed) and to post COSTREGULAR(X, A , N, C), where DFA A is counter DFA A stripped of its counter increases. Either way, this encoding is not particularly convenient and it seems natural to adopt counter automata as an abstraction. Also note that the cost matrix C has to be computed for every different value of |X| that occurs in the problem model, while this is not the case with counter automata. Essentially, regular counting is a specialisation of the generalised COSTREGULAR constraint (with a three-dimensional cost matrix), obtained by projecting the generalised cost matrix onto two different dimensions than in the original COSTREGULAR constraint, namely Q and Σ, and using it to extend the transition function of the DFA to the signature Q Σ Q N and calling the extended DFA a c DFA. At the efﬁciency level, regular counting constraints are propagated using dynamic programming, like COSTREGULAR. This is not surprising. The time complexity is the same as for the encoding using COSTREGULAR, namely O(n |Σ| |Q|), where n = |X|, as the unrolling of the c DFA takes the same time as the propagator itself. It is interesting however to note that the structure of regular counting constraints enables a better space complexity thanks to the compactness of a counter automaton as the input data structure, as well as fundamentally different internal data structures: we do not store the unrolled automaton. Indeed, we have shown that regular counting constraints have a space complexity of Θ(n (|Q| + |Σ|)), while the encoding by COSTREGULAR constraint takes Θ(n |Σ| |Q|) space, to store either the three-dimensional cost matrix C or the unrolled graph G (Choco allows both

seconds failures prunings Constraint #instances REGCOUNT AUTO REGCOUNT AUTO REGCOUNT AUTO AMONG(N, X, V) 4,400 0.8 1.8 2,241 2,241 3,749 3,749 NUMBERWORD(N, X, aab ) 13,200 0.9 2.3 4,060 4,020 1,294 943 NUMBERWORD(N, X, toto ) 17,600 0.9 2.7 4,446 4,435 1,149 663 REGCOUNT(N, X, RST ) 13,200 3.9 7.3 5,669 4,333 13,213 2,275 INFLEXION(N, X) 13,200 3.7 7.5 4,447 4,066 9,279 4,531

Table 1: Comparison between REGCOUNT and the AUTO(MATON) constraint of SICStus Prolog

ways of parametrising COSTREGULAR). Note that adding a counter to a DFA bears no asymptotic space overhead on the representation of the DFA. At the consistency level, note that our at-most and atleast regular counting propagators achieve domain consistency on the counter variable N in the same asymptotic time as the COSTREGULAR propagator (Demassey, Pesant, and Rousseau 2006; Choco 2012) achieves only bounds consistency on N. The claim by (Demassey, Pesant, and Rousseau 2006; Choco 2012) that their propagator achieves domain consistency on X is invalidated by Theorem 3 (stressing the need for propagator sanity checks like ours, even with random instances), hence this would only hold for at-most and at-least variants of COSTREGULAR: for the c DFA B in Figure 4 and the constraint REGCOUNT(N, [2, 2, x, 2, y], B), with N {1, 3} and x, y {1, 2}, their propagator misses the inference of y = 2, and so does our propagator for exact regular counting. Our data structures are more compact (see above), and yet enable more propagation on X for exact regular counting. To summarise, although regular counting constraints and the COSTREGULAR constraint are closely related, we believe that our results contribute both to our understanding of these constraints and to the practice in the ﬁeld. The SEQBIN constraint (Petit, Beldiceanu, and Lorca 2011; Katsirelos, Narodytska, and Walsh 2012) can be represented by a regular counting constraint, but it would require non-unary signature constraints. The METER constraint (Roy and Pachet 2013) does not take an automaton but constraints on two consecutive variables. It has only variables: the cost of a variable is its value, while counter increases or costs are distinguished from the variables for both REGCOUNT and COSTREGULAR. It does not provide a counter variable but allows partial sum constraints. It achieves domain consistency in pseudopolynomial time.

5 Conclusion

We consider regular counting constraints over ﬁnite variable sequences, which are ubiquitous and very diverse in sequencing and timetabling (e.g., restricting the number of monthly working weekends or two-day-periods where a nurse works during a night followed by an afternoon). We study a class of deterministic ﬁnite automata with counters (c DFA) that allows more concise models for regular counting constraints than representations using standard DFAs. Our ﬁrst contribution is to show how to enforce domain

consistency in polynomial time for at-most and at-least regular counting constraints, even for the counter variable, based on the frequent case of a c DFA with only accepting states and a single counter that can be increased by transitions. We also show that deciding the feasibility of exact regular counting constraints is NP-hard. Our second contribution is to reduce the space complexity from O(n |Σ| |Q|) to O(n (|Σ| + |Q|)) by not explicitly representing the unrolled c DFA.

It is possible to lift our restriction to counter automata where all states are accepting, even though we are then technically outside the realm of regular counting. For instance, this would allow us to constrain the number N of occurrences of some pattern, recognised by c DFA A1, in a sequence X of variables, while X is not allowed to contain any occurrence of another pattern, recognised by c DFA A2. Rather than decomposing this constraint into the conjunction of REGCOUNT(N, X, A1) and REGCOUNT(0, X, A2), with poor propagation through the shared variables, we can design a c DFA A12 that counts the number of occurrences of the ﬁrst pattern and fails at any occurrence of the second pattern (instead of counting them) and post the unique constraint REGCOUNT(N, X, A12), after using the following recipe. Add an accepting state, say q, and an alphabet symbol, say $, whose meaning is end-of-string. Add transitions on $ from all existing accepting states to q, with counter increase by zero. Add transitions on $ from all non-accepting states to q, with counter increase by a suitably large number, such as max(dom(N)) + 1. Make the non-accepting states accepting. Append the symbol $ to the sequence X when posting the constraint, so that the extended automaton never actually stops in a state different from q, thereby making it irrelevant whether the original states are accepting or not.

Open issues include the following questions. Can we implement our propagators to run in O(n |Σ|) time? Which c DFAs admit a propagator achieving domain consistency for exact regular counting? Can we generalise our domainconsistency result to non-unary signature constraints? We conjecture our results extend to non-deterministic counter automata: our notation was merely simpliﬁed by requiring that the transition function δQ return a single state.

Other constraints, such as the one of (Bart ak 2002), can be used to encode counters: a comparison is future work.

Since our propagator essentially uses linear constraints, we strongly believe it is also relevant in the context of linear programming, as in (Cˆot e, Gendron, and Rousseau 2007).

Acknowledgements The ﬁrst author is supported by INRIA s associate-team programme TASCMELB. The second and third authors are supported by grants 2011-6133 and 2012-4908 of the Swedish Research Council (VR). NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program. We wish to thank Mats Carlsson for help with SICStus Prolog, Joseph Scott for help with FAdo, and Arnaud Letort for help with performing the test with COSTREGULAR in Choco. We also thank the reviewers for their constructive comments.

References Almeida, M.; Moreira, N.; and Reis, R. 2007. Enumeration and generation with a string automata representation. Theoretical Computer Science 387(2):93 102. The FAdo tool is available at http://fado.dcc.fc.up.pt/. Bart ak, R. 2002. Modelling resource transitions in constraint-based scheduling. In Grosky, W. I., and Pl aˇsil, F., eds., SOFSEM 2002, volume 2540 of LNCS, 186 194. Springer. Beldiceanu, N., and Contejean, E. 1994. Introducing global constraints in CHIP. Journal of Mathematical and Computer Modelling 20(12):97 123. Beldiceanu, N.; Carlsson, M.; Demassey, S.; and Petit, T. 2007. Global constraint catalogue: Past, present, and future. Constraints 12(1):21 62. The current working version of the catalogue is at http://www.emn.fr/z-info/sdemasse/aux/ doc/catalog1.pdf. Beldiceanu, N.; Flener, P.; Pearson, J.; and Van Hentenryck, P. 2013. Propagating regular counting constraints. Technical Report 1309.7145, Computing Research Repository. Available at http://arxiv.org/abs/1309.7145. Beldiceanu, N.; Carlsson, M.; and Petit, T. 2004. Deriving ﬁltering algorithms from constraint checkers. In Wallace, M., ed., CP 2004, volume 3258 of LNCS, 107 122. Springer. Carlsson, M.; Ottosson, G.; and Carlson, B. 1997. An openended ﬁnite domain constraint solver. In Glaser, H.; Hartel, P.; and Kuchen, H., eds., PLILP 1997, volume 1292 of LNCS, 191 206. Springer. The solver is at http://sicstus. sics.se. Choco. 2012. Choco solver: Documentation, version 2.1.5. Available at http://choco.emn.fr/. Cˆot e, M.-C.; Gendron, B.; and Rousseau, L.-M. 2007. Modeling the Regular constraint with integer programming. In Van Hentenryck, P., and Wolsey, L. A., eds., CPAIOR 2007, volume 4510 of LNCS, 29 43. Springer. Demassey, S.; Pesant, G.; and Rousseau, L.-M. 2006. A Cost-Regular based hybrid column generation approach. Constraints 11(4):315 333. Katsirelos, G.; Narodytska, N.; and Walsh, T. 2012. The SEQBIN constraint revisited. In Milano, M., ed., CP 2012, volume 7514 of LNCS, 332 347. Springer.

Pesant, G. 2004. A regular language membership constraint for ﬁnite sequences of variables. In Wallace, M., ed., CP 2004, volume 3258 of LNCS, 482 495. Springer. Petit, T.; Beldiceanu, N.; and Lorca, X. 2011. A generalized arc-consistency algorithm for a class of counting constraints. In IJCAI 2011, 643 648. IJCAI/AAAI. Corrected version at http://arxiv.org/abs/1110.4719. Roy, P., and Pachet, F. 2013. Enforcing meter in ﬁnite-length Markov sequences. In des Jardins, M., and Littman, M. L., eds., AAAI 2013, 854 861. AAAI Press.