
Temporal Planning with Clock-Based SMT Encodings

Jussi Rintanen Aalto University, Department of Computer Science, Helsinki, Finland

We propose more scalable encodings of temporal planning in SMT. The first contribution is practical clock-based encodings of resources and effect delays. Existing encodings of effect delays (Shin and Davis, 2005) have a quadratic size, because the time differences between steps must be determined for a linear number of steps. Clocks improve this to linear. The second contribution is a new relaxed scheme for steps. Existing schemes require a step for every time point with discontinuous change. We relax this requirement, which improves scalability.

1 Introduction

After the successes of SAT in solving the classical planning problem [Kautz and Selman, 1996; 1999], Shin and Davis [2005] proposed encodings of temporal and hybrid systems planning in the SAT modulo Theories (SMT) framework. While that work solves conceptual problems of temporal planning with SMT, it has not proved successful in terms of performance. Shin and Davis adopted a temporal model with ϵ-separation [Fox and Long, 2003], which can double the number of steps, with a high performance penalty for constraint-based methods [Rintanen, 2015b]. ITSAT [Rankooh and Ghassem-Sani, 2015], probably the most scalable temporal planner, avoids the problems of ϵ-separation by reducing the temporal problem to a non-temporal one. ITSAT, however, often generates plans with makespans twice the optimal, which in practice is unacceptable. Rintanen [2015a] adopts a temporal model without ϵ-separation and with opportunities for discretization to integer time, which is not possible with ϵ-separation. The first application of SMT was classical planning with numeric variables [Wolfman and Weld, 1999]. SMT was little used before the recent works on temporal planning, and very recently on continuous change [Bryce et al., 2015].

Also afﬁliated with Grifﬁth University, Brisbane, Australia, and the Helsinki Institute for Information Technology, Finland. This work was funded by the Academy of Finland (Finnish Centre of Excellence in Computational Inference Research COIN, 251170). We acknowledge the computational resources provided by the Aalto Science-IT project.

In this work, we pursue the fundamentals of temporal (and hybrid) planning further, believing that a full temporal model is needed during search to achieve good-quality plans. First, we address the number of steps in SMT encodings, analogous to the number of time points in SAT encodings of classical planning [Kautz and Selman, 1996]. Shin and Davis [2005] require a step for every time point in which an action or a discrete change takes place. We propose a relaxed scheme in which one step can summarize discrete changes in multiple preceding time points. Second, we address the asymptotic size of encodings, reducing an encoding of time delays from quadratic to linear, while avoiding the excessive number of clocks of earlier encodings [Rintanen, 2015a]. The resulting encodings improve on the earlier state of the art in SMT-based temporal planning, solving dozens more of the standard benchmark problems, and sometimes improving runtimes by two orders of magnitude.

2 Model of Temporal Planning

We adopt the formal framework of Rintanen [2015a]. A temporal action consists of a precondition $\varphi$ (a propositional formula), and effects, which are negated or unnegated state variables associated with times $t' \geq 0$ indicating how long after the action the effect takes place. When the action is at absolute time $t$, an effect for time $t'$ is at absolute time $t+t'$. We call $t'$ the delay. An action can allocate a resource for a duration $d_0$, at some time $t_0$ relative to the starting point of the action: the resource is allocated for the interval from $t+t_0$ to $t+t_0+d_0$, and no other action can take place if it allocates the same resource for an intersecting interval.

Definition 1 (Actions with Resources) Let $X$ be a finite set of state variables and $R$ a finite set of resources. An action is a triple $\langle p,q,e \rangle$ where the precondition $p$ is a propositional formula over $X$, the resource requirement $q$ is a set of tuples $(t_s,t_d,r) \in \mathbb{Q} \times \mathbb{Q} \times R$ such that $t_d > 0$, and $(t_s,t_d,r,n) \in \mathbb{Q} \times \mathbb{Q} \times R \times \mathbb{N}$ such that $t_d > 0$, and the effect $e$ is a set of pairs $(t,l)$ where $t \geq 0$ is a rational number and $l$ is a literal over $X$.

We refer to the precondition $p$ of an action $a$ by $\mathit{prec}(a)$, the resource requirement $q$ by $\mathit{rreq}(a)$, and the effects $e$ by $\mathit{eff}(a)$. The set $R$ of resources is divided into two types. Unary resources, expressed by triples $(t_s,t_d,r)$, model absolute exclusion inside a group of actions: only one action can allocate $r$ at a time. State resources, expressed by tuples $(t_s,t_d,r,n)$, where $n$ is the state, can be used by multiple actions as long as the state $n$ is the same.

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17)

Definition 2 (Plans and Executions with Resources) For a problem instance $\langle X,R,I,A,G \rangle$, with state variables $X$, resources $R$, initial state $I$, actions $A$, and goal $G$, a plan $\pi$ is a finite set of pairs $(a,t)$ such that the following holds.

1. $t \geq 0$ is a rational number and $a \in A$ is an action.

2. For all $\{(a_1,t_1),(a_2,t_2)\} \subseteq \pi$ such that for some $r \in R$ either $(t^1_s,t^1_d,r) \in \mathit{rreq}(a_1)$ and $(t^2_s,t^2_d,r) \in \mathit{rreq}(a_2)$, or $(t^1_s,t^1_d,r,n_1) \in \mathit{rreq}(a_1)$, $(t^2_s,t^2_d,r,n_2) \in \mathit{rreq}(a_2)$, and $n_1 \neq n_2$, one of the following holds:

(a) $t_1+t^1_s+t^1_d \leq t_2+t^2_s$, or
(b) $t_2+t^2_s+t^2_d \leq t_1+t^1_s$.

3. There is an execution $v : \mathbb{Q} \times X \rightarrow \{0,1\}$, which is a mapping from non-negative rational time points and state variables to 0 and 1, such that

(a) $v(0,x)=I(x)$ for all $x \in X$,
(b) $v(t,\mathit{prec}(a))=1$ for all $(a,t) \in \pi$,
(c) if $(a,t) \in \pi$ and $(t',l) \in \mathit{eff}(a)$, then $v(t+t',l)=1$,
(d) state variables not changed by actions retain their values: for any $t_l$ and $t_u$ such that $t_l<t_u$, if $v(t_l,x)=1$, and there is no $(a,t) \in \pi$ such that $(t',\neg x) \in \mathit{eff}(a)$ and $t_l<t+t' \leq t_u$, then $v(t_i,x)=1$ for all $t_i$ such that $t_l<t_i \leq t_u$. (Analogously for $v(t_l,x)=0$.)

4. There is $t$ such that $v(t',G)=1$ for all $t' > t$.

Effects (0,l) that make preconditions of simultaneous actions true (including the action itself) are not allowed.
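Condition 2 of Definition 2 can be checked mechanically for a candidate plan. The following is a minimal sketch in Python, not the paper's implementation: the function names, the dictionary layout of `rreq`, and the restriction to at most one allocation per action and resource are assumptions made only for illustration.

```python
from fractions import Fraction
from itertools import combinations

def allocations_conflict(start1, alloc1, start2, alloc2):
    """Return True if two allocations of the same resource violate condition 2.

    An allocation is (ts, td) for a unary resource or (ts, td, n) for a
    state resource; start1 and start2 are the absolute action times.
    """
    # State resources only conflict when the requested states differ.
    if len(alloc1) == 3 and len(alloc2) == 3 and alloc1[2] == alloc2[2]:
        return False
    begin1, end1 = start1 + alloc1[0], start1 + alloc1[0] + alloc1[1]
    begin2, end2 = start2 + alloc2[0], start2 + alloc2[0] + alloc2[1]
    # Condition 2(a) or 2(b): one allocation ends before the other begins.
    return not (end1 <= begin2 or end2 <= begin1)

def plan_respects_resources(plan, rreq):
    """plan: list of (action, time); rreq: action -> {resource: allocation}.

    Simplifying assumption: each action allocates a resource at most once.
    """
    for (a1, t1), (a2, t2) in combinations(plan, 2):
        for r in rreq[a1].keys() & rreq[a2].keys():
            if allocations_conflict(t1, rreq[a1][r], t2, rreq[a2][r]):
                return False
    return True

# Hypothetical example: "move" holds the truck during ]0,2[, "load" during ]1,2[
# relative to its own start.
RREQ = {"move": {"truck": (Fraction(0), Fraction(2))},
        "load": {"truck": (Fraction(1), Fraction(1))}}
```

With these hypothetical actions, starting `load` one time unit after `move` is allowed because the first allocation ends exactly when the second begins, whereas starting it half a time unit after `move` is rejected.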

3 SMT Encodings of Temporal Planning

Timelines in temporal planning are continuous, but it is sufficient to represent explicitly only a finite sequence of time points in which an action starts or an effect takes place, called steps. For Boolean state variables $X=\{x_1,\ldots,x_n\}$, actions $A=\{a_1,\ldots,a_m\}$, and $N+1$ steps, we need SMT variables $x@i$ and $a@i$ for all $x \in X$, $a \in A$, and $i \in \{0,\ldots,N\}$. For each step, these SMT variables indicate the values of state variables and whether an action is taken. Variables $\tau@i$ denote the absolute time at step $i$, and $\Delta@i=\tau@i-\tau@(i-1)$. We constrain these values by $\Delta@i>0$. If $\varphi$ is the precondition of action $a$, we have the formula

$$a@i \rightarrow \varphi@i \tag{1}$$

where $\varphi@i$ is the formula obtained from $\varphi$ by replacing each $x$ by $x@i$. By $\mathit{causes}(x)@i$ we denote the disjunction of all the conditions under which $x$ becomes true at step $i$. These formulas typically refer to atomic propositions for earlier steps. Similarly, $\mathit{causes}(\neg x)@i$ covers $x$ becoming false. Hence when $\mathit{causes}(l)@i$ is true, the effect $l$ takes place.

$$\mathit{causes}(x)@i \rightarrow x@i \tag{2}$$
$$\mathit{causes}(\neg x)@i \rightarrow \neg x@i \tag{3}$$

Frame axioms allow inferring that the value of a state variable remains unchanged.

$$(x@i \wedge \neg x@(i-1)) \rightarrow \mathit{causes}(x)@i \tag{4}$$
$$(\neg x@i \wedge x@(i-1)) \rightarrow \mathit{causes}(\neg x)@i \tag{5}$$
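To make the interplay of the effect axioms (2), (3) and the frame axioms (4), (5) concrete, the sketch below checks them on a given truth assignment for a single state variable. It is only an illustration: the function name and the list-based representation of the step values are assumptions, and an SMT solver would of course search for such assignments rather than check them.

```python
def satisfies_axioms(values, causes_true, causes_false):
    """Check formulas (2)-(5) for one state variable over steps 0..N.

    values[i] is x@i; causes_true[i] and causes_false[i] are the truth
    values of causes(x)@i and causes(not x)@i (index 0 is unused).
    """
    for i in range(1, len(values)):
        if causes_true[i] and not values[i]:
            return False      # (2): a cause for x forces x@i
        if causes_false[i] and values[i]:
            return False      # (3): a cause for not-x forces not x@i
        if values[i] and not values[i - 1] and not causes_true[i]:
            return False      # (4): x became true without a cause
        if not values[i] and values[i - 1] and not causes_false[i]:
            return False      # (5): x became false without a cause
    return True
```

A variable that switches from false to true at step 1 passes the check only if `causes_true[1]` holds, which is exactly what the frame axiom (4) demands.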

How the disjuncts of $\mathit{causes}(x)@i$ are defined depends on when the action causing the change takes place, and how the time difference between the action and the effect is expressed. Here we first present the Shin and Davis style encoding of delay, and will consider other options later. A disjunct of $\mathit{causes}(x)@i$ for effects $x$ at a relative time $t>0$ of an action $a$ is

$$\bigvee_{j=0}^{i-1} \big(a@j \wedge ((\tau@i-\tau@j)=t)\big) \tag{6}$$

if encoded in the spirit of Shin and Davis [2005], formula (4.3). These constraints have a quadratic size, because every step $i$ refers to all earlier steps $j$. When $t>0$, to guarantee that there is a step for every effect, we need

$$a@i \rightarrow \bigvee_{j=i+1}^{N} (\tau@j-\tau@i=t). \tag{7}$$
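The quadratic growth of formula (6) is easy to see by writing the disjunction out for each step. The sketch below builds it as a plain string and counts the time-difference atoms for one delayed effect; the textual syntax (`&`, `|`, `tau@i`) and the function names are assumptions chosen for readability, not an SMT-LIB rendering.

```python
def shin_davis_disjunct(action, i, delay):
    """Formula (6) for step i as a string: one conjunct per earlier step j."""
    parts = [f"({action}@{j} & (tau@{i} - tau@{j} = {delay}))" for j in range(i)]
    return " | ".join(parts) if parts else "false"

def count_time_atoms(num_steps):
    """Time-difference atoms of formula (6) over steps 1..num_steps for one effect."""
    return sum(i for i in range(1, num_steps + 1))

def count_clock_atoms(num_steps):
    """A clock-based trigger of the form c@i = t needs one atom per step."""
    return num_steps
```

For 100 steps this gives 5050 time-difference atoms per delayed effect in the style of (6), against 100 atoms for a trigger that reads a single clock value at each step.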

3.1 Resource Constraints

Consider actions $a_1$ and $a_2$ such that $(t_1,d_1,r) \in \mathit{rreq}(a_1)$ and $(t_2,d_2,r) \in \mathit{rreq}(a_2)$. If the intervals $]t_1,t_1+d_1[$ and $]t_2,t_2+d_2[$ overlap, then the actions cannot be at the same step. The following constraint prevents these two actions from starting at the same step.

$$\neg a_1@i \vee \neg a_2@i \tag{8}$$

Now consider the case in which $a_1$ may have been taken earlier at time $0-t$, and this may prevent $a_2$ from being taken at the current step at time 0. The requirement that the allocations of the resource by the two actions do not overlap means that the relative time $t$ where the action $a_1$ may have been taken satisfies one of the following.

$$0-t+t_1+d_1 \leq t_2 \qquad\qquad 0-t+t_1 \geq t_2+d_2$$

This means that either the first action frees the resource before the second allocates it, or vice versa. Simplified, these constraints are as follows.

$$t_1+d_1-t_2 \leq t \qquad\qquad t_1-t_2-d_2 \geq t$$

Hence $a_2$ is not allowed if $a_1$ is at $]-t_1-d_1+t_2,\,-t_1+t_2+d_2[$ relative to the current time point.

$$a_2@i \rightarrow \big(a_1@j \rightarrow \neg(t_1-t_2-d_2 < \Delta_i^j < t_1+d_1-t_2)\big) \quad \text{for all } j<i \tag{9}$$


Here $\Delta_i^j=\sum_{k=j+1}^{i} \Delta@k$, or, equivalently, $\Delta_i^j=\tau@i-\tau@j$. If $t_1<t_2+d_2$, then testing the bound $t_1-t_2-d_2$ is unnecessary, because no matter how recently the first action was taken, that action can never allocate the resource after $a_2$ has freed it. Similarly, if $t_2 \geq t_1+d_1$, the earlier action must already have freed the resource before the second action could allocate it, requiring no constraints. Constraints on state resources are similar. A constraint for a pair of actions is required only if the actions allocate the same state resource with different states. In all of the above constraints, a reference to a single action (like $a_1@i$) can be replaced by a disjunction of actions (like $a_1@i \vee a_2@i \vee \cdots \vee a_n@i$), when all these actions allocate the same resource for the same interval (and state).
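The window of time differences excluded by formula (9), together with the two simplifications just described, can be computed directly from the allocation parameters. The following sketch is an illustration under the assumption that gaps between distinct steps are strictly positive; the function name and return convention are hypothetical.

```python
from fractions import Fraction

def forbidden_gap(t1, d1, t2, d2):
    """Open interval of gaps tau@i - tau@j for which a1@j and a2@i conflict.

    Returns None when no positive gap can cause a conflict.
    """
    low = t1 - t2 - d2     # beyond this bound a1 allocates only after a2 has freed r
    high = t1 + d1 - t2    # from this bound on a1 has freed r before a2 allocates it
    # Gaps are positive, so a negative lower bound need not be tested.
    low = max(low, Fraction(0))
    if high <= low:
        return None        # the earlier action has always freed the resource in time
    return (low, high)
```

For example, if both actions allocate from their start, with durations 2 and 1, the excluded gaps are exactly those in ]0,2[; if the earlier action allocates for ]5,6[ and the later for ]0,1[, the excluded gaps lie in ]4,6[.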

3.2 Effect Delays with Clocks

Rintanen [2015a] devised clock-based encodings of delays, and then pointed out that there will be impractically many real-valued variables, hampering efficient SMT solving. Here we briefly explain the use of clocks, and in Section 6 we propose a practical scheme for sharing clocks between actions. The values of a clock $c$ at different steps $i$ are represented by SMT variables $c@i$. A clock $c$ associated with action $a$ is initialized to zero when the action is taken

$$a@i \rightarrow (c@i=0) \tag{10}$$

and the value of the clock is increased at all other steps. Here $i \in \{1,\ldots,N\}$.

$$\neg a@i \rightarrow (c@i=c@(i-1)+\Delta@i) \tag{11}$$

The trigger for effect $x$ with delay $t$ in $\mathit{causes}(x)@i$ is now¹

$$c@i=t. \tag{12}$$

To guarantee that there is a step where (12) is true, we need

$$(c@(i-1)<t) \rightarrow (c@i \leq t). \tag{13}$$

One clock per action can be insufficient if an action can overlap itself. Self-overlap is atypical, but when it is required, there are three options. In specific situations one clock can still suffice (see Section 5). If an action can self-overlap only a bounded number of times, a bounded number of clocks is sufficient. In case of unbounded self-overlap [Rintanen, 2007], a fall-back position would be to use a clock-free scheme like that of Shin and Davis.
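The behaviour of an action-specific clock under formulas (10) and (11), and the steps at which trigger (12) fires, can be traced with a few lines of Python. This is a sketch with hypothetical function names; representing the clock as `None` before the first reset is a simplification of this example, since the encoding itself leaves the initial value unconstrained.

```python
def clock_values(step_times, action_steps):
    """Values c@i of one action's clock following formulas (10) and (11).

    step_times[i] is tau@i; action_steps is the set of steps where the
    action is taken.
    """
    values = []
    for i, tau in enumerate(step_times):
        if i in action_steps:
            values.append(0)                                         # (10): reset
        elif i > 0 and values[i - 1] is not None:
            values.append(values[i - 1] + tau - step_times[i - 1])   # (11): advance
        else:
            values.append(None)                                      # not yet reset
    return values

def trigger_steps(values, delay):
    """Steps where trigger (12), c@i = delay, holds."""
    return [i for i, c in enumerate(values) if c == delay]
```

With step times 0, 1, 3, 4 and the action taken at step 1, the clock reads 0, 2, 3 at steps 1 to 3, so an effect with delay 2 is triggered at step 2.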

4 Summarized Effects

A main problem with the encodings is the high number of steps, which increases solver runtimes. In the above encodings, there must be a step for the actions' starting points, and also for any time point where a state variable changes.

Example 1 Consider simultaneously taking three actions with durations 1, 2, and 3, and with respective effects $x$, $y$ and $z$, to reach the goal $x \wedge y \wedge z$. With all encodings considered before, four steps are needed (see Figure 1, left): the starting step of all actions, and steps for the relative time points 1, 2, and 3, in which respectively $x$, $y$ and $z$ become true, and with the goal $x \wedge y \wedge z$ true at the last step.

¹ If an action can immediately follow itself, $c@(i-1)+\Delta@i$ must replace $c@i$ here. The same fix is needed later in (14), (15) and (16).

Figure 1: Steps for effects are not needed

Figure 2: Plan with fewer steps may have a longer makespan

We propose a new scheme, in which a step is not necessary for time points with change, and effect axioms are relaxed to force an effect at a given step if its time is greater than or equal to the change time. Essentially, the infinitely dense line of real or rational time points has to be made explicit as steps only at those time points in which an action is started, or before an effect takes place that contradicts an earlier effect which has not yet been recorded at a step. In the above example, this scheme allows a 2-step plan in which all three effects, respectively for time points 1, 2 and 3, are recorded at the second step, associated with any time point $\geq 3$, as illustrated in Figure 1, right. This scheme is easy to implement as a modification of the clock-based encoding of effect delays presented earlier. We only need in $\mathit{causes}(x)@i$ for effects $(t,l)$ the formula

$$(c@i \geq t) \wedge (c@(i-1)<t), \tag{14}$$

relaxing and replacing (12), and we do not need formula (13). This scheme creates new trade-offs between the makespans and the number of steps, as illustrated in Figure 2. The relaxed scheme may resemble the idea of parallel plans in classical planning [Kautz and Selman, 1996; Rintanen et al., 2006; Wehrle and Rintanen, 2007], where multiple actions are allowed at the same step whenever they are independent and can be ordered to a total order. Our scheme allows merging multiple steps, but still assumes a (non-strict) total ordering on the starting points of actions.
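Example 1 can be replayed with the relaxed trigger (14). The sketch below computes, for a clock reset at a given step, the step at which each delayed effect is recorded. The function name and the dictionary result are assumptions of this illustration, and only positive delays are considered.

```python
def summarized_effect_steps(step_times, start_step, delays):
    """For each delay t > 0, the step where trigger (14) fires.

    The clock is reset to 0 at start_step and follows formula (11) afterwards.
    """
    fired = {}
    for t in delays:
        for i in range(start_step + 1, len(step_times)):
            c_now = step_times[i] - step_times[start_step]
            c_prev = step_times[i - 1] - step_times[start_step]
            if c_now >= t and c_prev < t:      # formula (14)
                fired[t] = i
    return fired
```

With only two steps at times 0 and 3, all three effects of Example 1 are recorded at step 1, whereas with steps at times 0, 1, 2, 3 each effect is recorded at its own step, as in the stricter scheme.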

5 Clocks for Resource Constraints

We next present encodings of resource constraints expressed in terms of clocks. Encodings of resource constraints presented earlier are based on pairwise exclusions of pairs of actions or pairs of action sets. This type of encoding can be quadratic in the worst case, although with many temporal planning problems it can be improved to close to linear by lumping together all actions that allocate a given resource for the same (relative) time interval. Our first observation is that clocks help achieve $O(n)$ size encodings, as opposed to the quadratic size $O(n^2)$ of formula (9). As the number of resources is low (some dozens for standard benchmarks), associating the clocks with them is much more practical than associating them with actions.


The core idea is that clocks can express resource allocations through their value in two different ways. First, a clock value in the interval $]t,t+d[$ denotes the allocation of the corresponding resource for a fixed duration $d$ starting from a relative time point $t$ counted from the resetting of the clock to 0 at the step where the action is taken. Second, a clock value $<0$ can denote an allocation of any duration extending from the current time point (and earlier) until the clock reaches value 0. In this case, an action allocating the resource for duration $d$ from its beginning resets the clock to value $-d$. The second option covers almost all resources in standard temporal planning benchmark problems, which may allocate resources for different durations but in almost all cases do it from the beginning of an action. Shared clocks cannot always accurately represent resource allocations. Allocations of the same state resource by different actions for different durations cannot be handled by one clock. The same holds for other resources in specific cases. When clocks cannot be used, the fall-back strategy is to use formula (9).

Example 2 Consider an action that allocates a unary resource for interval ]9,10[. Assume this action does not use any other resource. Assume this action is taken at time points 0 and 1. The resource clock is reset to 0 at both time points. Now the value of the clock from time point 1 on only represents the second instance of this action. Taking an action that requires the same resource for the relative interval ]0,1[ would be possible at time 9, which is not correct.

Example 3 If all actions that allocate a given resource allocate it for interval ]9,10[, then all resource conﬂicts can be detected by using one clock. Assume an action is taken at 0. By checking the clock, it can be determined that a second instance of the same action is not possible before time 1, at which point the clock is reset.

An important special case, which covers almost all resource conflicts in almost all of the standard benchmark problems for temporal planning, is the following. Let all actions allocate a resource at the action's starting point (relative time 0), for durations $d_a$, dependent on the action $a$. The resource clock $c$ is reset to $-d_a$ when action $a$ is taken. An action causes a resource conflict at a given time point iff $c<0$. Next we set out to devise a general encoding scheme which uses clocks whenever possible and handles all cases correctly. We consider two cases.

1. If at least one action allocates a resource $r$ at relative time 0 (the starting point of the action), use clock $c^0_r$. If action $a$ allocates $r$ with $(0,d,r)$, then at the starting point of the action the clock $c^0_r$ is reset to $-d$.

2. For any other allocation $(t,d,r)$ (with $t>0$) occurring at least once, use clock $c^{t,d}_r$. Any action allocating $r$ with $(t,d,r)$ resets $c^{t,d}_r$ to 0.
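The special case in which every action allocates the resource from its starting point can be simulated with a single negative-valued clock. The sketch below is a greedy illustration, not part of the encoding: the function name is hypothetical, requests are assumed to arrive in increasing time order, and the resource is assumed to be free initially.

```python
def schedule_with_resource_clock(requests):
    """Accept or reject actions that allocate one unary resource from their start.

    requests: list of (time, duration) in increasing time order.
    The clock is reset to -duration on acceptance and grows with time;
    an action conflicts while the clock is still negative.
    """
    clock = 0          # assumption: no allocation is pending initially
    last_time = 0
    accepted = []
    for time, duration in requests:
        clock += time - last_time        # the clock advances with elapsed time
        last_time = time
        if clock < 0:
            continue                     # resource still allocated: conflict
        clock = -duration                # reset to -d_a
        accepted.append((time, duration))
    return accepted
```

A request arriving while the clock is negative is rejected; one arriving exactly when the clock has returned to 0 is accepted and resets the clock to its own negated duration.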

Next we discuss the conditions under which resource conflicts between actions can be detected by using the clocks associated with the resource allocations. Note that constraints on actions taken at the same step are always handled by static constraints as described earlier, and here we only address constraints on actions taken at different steps. Consider actions $a_1$ and $a_2$ (this includes the case $a_1=a_2$) which respectively allocate the same unary resource $r$ with $(0,d_1,r)$ and $(t_2,d_2,r)$. The constraint for handling conflicts of this form is the following.

$$a_2@i \rightarrow (c^0_r@i \geq -t_2) \tag{15}$$

Consider actions $a_1$ and $a_2$ which respectively allocate the same unary resource $r$ with $(t_1,d_1,r)$ and $(t_2,d_2,r)$. This is the general case, which cannot always be handled with clocks. The question is whether the clock $c^{t_1,d_1}_r$ represents a sufficient amount of information about earlier allocations $(t_1,d_1,r)$ of $r$ so that the conflict with allocations $(t_2,d_2,r)$ can be detected with it, by using the following constraint.

$$a_2@i \rightarrow \big((t_1+d_1-c^{t_1,d_1}_r@i \leq t_2) \vee (t_1-c^{t_1,d_1}_r@i \geq t_2+d_2)\big) \tag{16}$$

It turns out that the clock is insufficient only if the immediately preceding action allocates the resource relatively later than the current action does, and some still earlier action allocates the resource in a conflicting way.

Proposition 1 The clock $c^{t_1,d_1}_r$ is sufficient if $t_2+d_2 \geq t_1$.

Proof: Consider action $a_2$, taken at time $t$ and allocating $r$ with $(t_2,d_2,r)$, as well as actions $a_0$ and $a_1$, both of which allocate the unary resource $r$ with $(t_1,d_1,r)$, with $a_1$ taken at some time $t'<t$ and $a_0$ at some time $t''$ such that $t''+d_1 \leq t'$ (because both use the resource for duration $d_1$). We claim that if the allocation of $r$ by $a_2$ at time $t$ does not conflict with the allocation by $a_1$, then it will not conflict with the allocation by $a_0$ either. Hence the conflict can be detected from the clock, which only represents the allocation by the later action $a_1$ but not by $a_0$. Since $a_1$ and $a_2$ do not conflict, their intervals of allocation do not overlap, that is, $t'+t_1+d_1 \leq t+t_2$ or $t'+t_1 \geq t+t_2+d_2$. By assumption $t_2+d_2 \geq t_1$ and $t>t'$, and hence the second condition cannot be true, that is, $a_1$ could not possibly allocate the resource after $a_2$ has freed it. Therefore the only possibility of not having a conflict is that $t'+t_1+d_1 \leq t+t_2$. Now, because $t''<t'$ we have $t''+t_1+d_1 \leq t+t_2$. Hence $a_2$ does not conflict with $a_0$ either.
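Proposition 1 and the situation of Example 2 can be cross-checked by brute force on a small grid of time points. The sketch below searches for a scenario in which constraint (16), evaluated on the clock of the latest action $a_1$, is satisfied although an earlier action $a_0$ conflicts with $a_2$. The function names, the grid, and the exhaustive search are assumptions of this illustration, not part of the paper.

```python
from fractions import Fraction
from itertools import product

def overlap(start_a, ta, da, start_b, tb, db):
    """True if the two open allocation intervals intersect."""
    return not (start_a + ta + da <= start_b + tb or start_b + tb + db <= start_a + ta)

def clock_check_ok(c, t1, d1, t2, d2):
    """Constraint (16): c is the time since the latest (t1, d1, r) allocation."""
    return t1 + d1 - c <= t2 or t1 - c >= t2 + d2

def find_missed_conflict(t1, d1, t2, d2, grid):
    """Search for times (t_a0, t_a1, t) where (16) passes but a0 conflicts with a2."""
    for t_a0, t_a1, t in product(grid, repeat=3):
        if not (t_a0 + d1 <= t_a1 < t):
            continue  # as in the proof: a0 is at least d1 before a1, and a1 before a2
        if clock_check_ok(t - t_a1, t1, d1, t2, d2) and overlap(t_a0, t1, d1, t, t2, d2):
            return (t_a0, t_a1, t)
    return None

# Hypothetical grid: time points 0, 0.5, 1, ..., 12.
GRID = [Fraction(k, 2) for k in range(25)]
```

For the allocations of Example 2, ]9,10[ against ]0,1[, the search finds a missed conflict, since $t_2+d_2 < t_1$. For allocations satisfying $t_2+d_2 \geq t_1$ the search on this grid returns `None`, in line with the proposition.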

The same considerations apply to state resources. In summary, the general scheme for deriving resource constraints is as follows. For a given resource $r$, we consider each possible pair of allocations $(t_1,d_1,r)$ and $(t_2,d_2,r)$ at a time (with all actions making the same allocation considered together), and create a constraint to prevent the conflict between them. If $t_1=0$, then we can always use formula (15). If $t_1>0$, then we use Proposition 1 to check if the clock $c^{t_1,d_1}_r$ with formula (16) is sufficient, and if not, we use formula (9) instead. For standard benchmarks this scheme generally leads to the very compact constraints (15).
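The selection rule summarized above amounts to a three-way case distinction per pair of allocations. A minimal sketch follows, assuming the sufficiency condition of Proposition 1 is read as the latest allocation $(t_1,d_1,r)$ not starting relatively later than the end of $(t_2,d_2,r)$; the function name and the string labels are hypothetical.

```python
def choose_resource_constraint(t1, d1, t2, d2):
    """Pick the formula for an earlier allocation (t1, d1, r) against (t2, d2, r)."""
    if t1 == 0:
        return "(15)"   # clock c^0_r, reset to -d1 at the start of the action
    if t2 + d2 >= t1:
        return "(16)"   # the clock for (t1, d1, r) carries enough information
    return "(9)"        # fall back to the pairwise, clock-free constraint
```

For instance, the allocations of Example 2, ]9,10[ followed by ]0,1[, fall into the last case and need the clock-free constraint.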


6 Shared Clocks for Effect Delays

Section 3.2 described how clocks could be used for representing effect delays (for relative time points $>0$). The high number of real-valued variables required makes that representation impractical. However, a different kind of scheme for using clocks typically avoids the high number of clocks. The basic observation is that two (or more) actions can share a clock whenever the actions cannot temporally overlap. Temporal overlap is not possible when the actions use the same exclusive resource for their whole duration.

Example 4 Consider actions for moving an object from location to location. To prevent taking two actions that move the object from one location to two different locations, all these actions allocate the same resource specific to the object for their whole duration.

In cases like the above we can use one clock for a large number of actions. The clock is reset when an action is taken, and the action's effects take place when the clock reaches a value corresponding to the delay of the effect. The basic idea is to use the clock for those resource allocations that start at relative time point 0 of the action, and that last until the last effect of the action. The actions cannot overlap because they use the same unary resource for their whole duration. Hence this one clock can be shared by multiple actions. This way, practically all standard benchmark problems require only some dozens of clocks.

6.1 Qualitative Clocks

Since a shared clock does not tell which action is active, we must use qualitative (Boolean) clocks to do that. These indicate a qualitative value range for the clock (a single value, or a range of values). Note that while the clock is initialized to $-d$ when the action allocates the resource with $(t,d,r)$, the time values of the qualitative clocks run from 0 until $d$, for clarity. The constraints for connecting the real-valued clock to the qualitative clocks are given as (17)-(19), (27), (28), (36). The connection between the real-valued clock $c$ and the qualitative clock $c_q$ representing the time since the start of the action, relative to the reset value $d_a$ of the clock for this action, is $c+d_a=c_q$.

Let $T=\{t_1,\ldots,t_n\}$ be time points, for example those (relative) time points in which the effects of a given action take place. We use the propositional variables $q^a_{<t_i}@j$ and $q^a_{t_i}@j$, for $i \in \{1,\ldots,n\}$, to abstractly represent the clock value $c$ satisfying respectively $t_{i-1}<c<t_i$ and $c=t_i$. The variables $q^a_{<t_i}$ and $q^a_{t_i}$ are related to the real-valued clock in the obvious way (where $c+d_a=c_q$ as discussed above).

$$q^a_{<t_i}@j \rightarrow (t_{i-1}<c@j+d_a) \tag{17}$$
$$q^a_{<t_i}@j \rightarrow (c@j+d_a<t_i) \tag{18}$$
$$q^a_{t_i}@j \rightarrow (c@j+d_a=t_i) \tag{19}$$
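The relation between a real-valued clock and the qualitative clock variables in (17)-(19) can be illustrated by mapping a clock value to the one qualitative variable that may hold for it. This is a sketch only: the function name, the string labels, and the convention $t_0=0$ are assumptions of this example.

```python
def qualitative_value(c, d_a, thresholds):
    """Qualitative clock variable compatible with the real-valued clock value c.

    thresholds is the sorted list [t_1, ..., t_n]; the real-valued clock was
    reset to -d_a, so c + d_a is the time since the action started.
    """
    elapsed = c + d_a
    previous = 0                       # assumption: t_0 = 0
    for t in thresholds:
        if elapsed == t:
            return f"q_{t}"            # matches (19)
        if previous < elapsed < t:
            return f"q_<{t}"           # matches (17) and (18)
        previous = t
    return None                        # at the action's own step or past t_n
```

For an action with reset value 5 and effect times 2 and 5, the clock value -4 falls in the range below 2, the value -3 hits the time point 2 exactly, and the value 0 hits the time point 5.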

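Qualitative clocks are also useful because reasoning about the clock values is possible before any of the relevant real-valued variables have been assigned a value by the SMT solver. To support these inferences we use the following formulas for all $i \in \{1,\ldots,n-1\}$ and $k \in \{2,\ldots,n\}$.

$$a@j \rightarrow \big(q^a_{<t_1}@(j+1) \vee q^a_{t_1}@(j+1)\big) \tag{20}$$
$$q^a_{<t_i}@j \rightarrow \big(q^a_{<t_i}@(j+1) \vee q^a_{t_i}@(j+1)\big) \tag{21}$$
$$q^a_{t_i}@j \rightarrow \big(q^a_{<t_{i+1}}@(j+1) \vee q^a_{t_{i+1}}@(j+1)\big) \tag{22}$$
$$q^a_{<t_k}@j \rightarrow \big(q^a_{<t_k}@(j-1) \vee q^a_{t_{k-1}}@(j-1)\big) \tag{23}$$
$$q^a_{t_k}@j \rightarrow \big(q^a_{<t_k}@(j-1) \vee q^a_{t_{k-1}}@(j-1)\big) \tag{24}$$
$$q^a_{<t_1}@j \rightarrow \big(q^a_{<t_1}@(j-1) \vee a@(j-1)\big) \tag{25}$$
$$q^a_{t_1}@j \rightarrow \big(q^a_{<t_1}@(j-1) \vee a@(j-1)\big) \tag{26}$$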

All the qualitative clock variables for a given step are mutually exclusive. Further, when the qualitative clocks of different actions represent the same real-valued clock, all the qualitative clock variables also for all these different actions are mutually exclusive.

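6.2 Summarized Effects

For the encodings that allow changes at time points not made explicit as a step, as proposed earlier, we introduce variables $q^a_{\geq t_i}@j$ for indicating that the clock reaches value $t_i$ at step $j$ or between steps $j-1$ and $j$. These variables are related to the clock values as follows.

$$q^a_{\geq t_i}@j \rightarrow (c@(j-1)+d_a+\Delta@j \geq t_i) \tag{27}$$
$$q^a_{\geq t_i}@j \rightarrow (c@(j-1)+d_a<t_i) \tag{28}$$

The $q^a_{\geq t_i}$ variables are used with $q^a_{<t_i}$ (and without $q^a_{t_i}$, replacing them), and they connect together more loosely. We have for all $i \in \{1,\ldots,n-1\}$ and $k \in \{2,\ldots,n\}$:

$$a@j \rightarrow \big(q^a_{<t_1}@(j+1) \vee q^a_{\geq t_1}@(j+1)\big) \tag{29}$$
$$q^a_{<t_i}@j \rightarrow \big(q^a_{<t_i}@(j+1) \vee q^a_{\geq t_i}@(j+1)\big) \tag{30}$$
$$q^a_{<t_k}@j \rightarrow \big(q^a_{<t_k}@(j-1) \vee q^a_{\geq t_{k-1}}@(j-1)\big) \tag{31}$$
$$q^a_{<t_1}@j \rightarrow \big(q^a_{<t_1}@(j-1) \vee a@(j-1)\big) \tag{32}$$
$$\big(q^a_{\geq t_i}@j \wedge \neg q^a_{\geq t_{i+1}}@j\big) \rightarrow \big(q^a_{<t_{i+1}}@(j+1) \vee q^a_{\geq t_{i+1}}@(j+1)\big) \tag{33}$$
$$\big(q^a_{\geq t_k}@j \wedge \neg q^a_{\geq t_{k-1}}@j\big) \rightarrow \big(q^a_{<t_k}@(j-1) \vee q^a_{\geq t_{k-1}}@(j-1)\big) \tag{34}$$
$$q^a_{\geq t_1}@j \rightarrow \big(q^a_{<t_1}@(j-1) \vee a@(j-1)\big) \tag{35}$$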

Note that we can have $q^a_{\geq t_i}@j$ and $q^a_{\geq t_k}@j$ true at the same step $j$, even for multiple $t_k$, $k \neq i$. Additionally, variables $q^a_{<t_i}$ satisfy the following.

$$q^a_{<t_i}@j \rightarrow (c@(j+1)+d_a<t_i) \tag{36}$$

In formulas $\mathit{causes}(x)@i$ the qualitative clocks are used similarly to real-valued action-specific clocks. Effect triggering in effect and frame axioms (2) and (5) in different encoding styles is summarized below.

disjunct of causes(x)@i                  clock type
$c@i=t$                                  own
$q^a_t@i$                                shared
$c@i \geq t \wedge c@(i-1)<t$            own, summarization
$q^a_{\geq t}@i$                         shared, summarization


Domain                 Instances  ITSAT   SD    C    R
08-crewplanning               30     30   10   14   15
08-elevators                  30     16    4    6    9
08-elevators-num              30      -    4    8   13
08-openstacks                 30     30    4    5    7
08-pegsol                     30     30   30   30   30
08-sokoban                    30     17   17   17   16
08-transport                  30      -    4    6    8
08-woodworking                30      -   16   15   23
08-openstacks-adl             30      -    3    5    8
08-openstacks-num-adl         30      -    5    9   18
11-floortile                  20     20   20   20   20
11-matchcellar                10     10   10   10   10
11-parking                    40      9   12   12   12
11-storage                    20     10    0    0    0
11-tms                        20     20   20   20   20
11-turnandopen                20     20   18   18   18
14-floortile                  20     20   20   20   20
14-matchcellar                20     20   19   20   19
14-parking                    20     18   19   19   19
14-tms                        20     20   20   20   20
14-turnandopen                20      9    5    5    5
14-driverlog                  30      4    0    0    0
total (w/o numeric)          410    303  228  236  240
total                        560    303  260  279  310

Table 1: Instances solved in 1800 seconds by domain

Only for the trigger $c@i=t$ do we need to force a step for time $t$ (formula (13)). For triggers $q^a_t@i$ this follows from the axioms for qualitative clocks, which require that all qualitative clock-value ranges are visited in turn.

7 Experiments

We compare our results to the Shin-Davis style step scheme and to ITSAT, which is one of the strongest temporal planners [Rankooh and Ghassem-Sani, 2015], shown to outperform earlier planners [Gerevini et al., 2006; Coles et al., 2010; Eyerich et al., 2012; Lu et al., 2013]. Experimentation was based on Rintanen's [2015a] code, including the discretization method to eliminate time variables whenever possible. We used MathSAT 5.3.6 [Audemard et al., 2005; Cimatti et al., 2013] for instances with real-valued variables, and PrecoSAT [Biere, 2010] for purely Boolean ones. Experiments were run on Intel Xeon CPUs. Of the SMT-based configurations, SD is the baseline encoding [Rintanen, 2015a]. C is obtained from SD by using the clock-based encodings of resource constraints and delays (Sections 5 and 6). R additionally uses summarized steps (Section 4). Table 1 lists the number of IPC instances solved in 1800 seconds. ITSAT does not handle numeric variables and cannot solve the problems indicated with a dash. Differences between SD, C and R are not clearly visible here: many problem series are solved (almost) completely by all planners, some series are too difficult, and, for example, Parking gets fully discretized so that all three planners use the same SAT encoding. Figure 3 plots makespans for instances solved by both ITSAT and R, with ITSAT makespans often close to twice those of R. R makespans are higher only with a few instances of TMS. On these, C makespans are the same as ITSAT's. Figure 4 shows that ITSAT's overall runtime advantage is not very systematic. Other data shows that R quite systematically improves on C, but not on SD (despite overall improvement).

Figure 3: Comparison of makespans ITSAT vs. R

Figure 4: Comparison of runtimes ITSAT vs. R (runtime in seconds)

8 Conclusion

We proposed new ways of using clocks in SMT encodings of temporal planning, decreasing the asymptotic size of some of the core constraints for temporal planning, and showed that the number of steps, a main factor in SAT/SMT solver performance, can be reduced with a relaxed encoding scheme. Both contributions improve the scalability of SMT in temporal planning. Our experiments showed that although the scalability and runtime improvements are not yet sufficient to match the state of the art, represented by the ITSAT planner, plan quality is much better than with ITSAT, which often generates plans with a makespan twice the optimal or more. Future work includes further investigation of schemes for reducing the complexity of handling temporal dependencies between actions, as well as better implementation technologies for temporal planning in general, for example following the lines that have proved successful with classical planning [Rintanen, 2012a; 2012b].


References

[Audemard et al., 2005] Gilles Audemard, Marco Bozzano, Alessandro Cimatti, and Roberto Sebastiani. Verifying industrial hybrid systems with MathSAT. Electronic Notes in Theoretical Computer Science, 119(2):17-32, 2005.

[Biere, 2010] Armin Biere. Lingeling, Plingeling, PicoSAT and PrecoSAT at SAT Race 2010. Technical Report 1, Institute for Formal Models and Verification, Johannes Kepler University, 2010.

[Bryce et al., 2015] Daniel Bryce, Sicun Gao, David J. Musliner, and Robert P. Goldman. SMT-based nonlinear PDDL+ planning. In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI-15), pages 3247-3253. AAAI Press, 2015.

[Cimatti et al., 2013] Alessandro Cimatti, Alberto Griggio, Bastiaan Joost Schaafsma, and Roberto Sebastiani. The MathSAT5 SMT solver. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 93-107. Springer-Verlag, 2013.

[Coles et al., 2010] Amanda Jane Coles, Andrew Coles, Maria Fox, and Derek Long. Forward-chaining partial-order planning. In ICAPS 2010. Proceedings of the Twentieth International Conference on Automated Planning and Scheduling, pages 42-49. AAAI Press, 2010.

[Eyerich et al., 2012] Patrick Eyerich, Robert Mattmüller, and Gabriele Röger. Using the context-enhanced additive heuristic for temporal and numeric planning. In Towards Service Robots for Everyday Environments, pages 49-64. Springer-Verlag, 2012.

[Fox and Long, 2003] Maria Fox and Derek Long. PDDL2.1: an extension to PDDL for expressing temporal planning domains. Journal of Artificial Intelligence Research, 20:61-124, 2003.

[Gerevini et al., 2006] Alfonso Gerevini, Alessandro Saetti, and Ivan Serina. An approach to temporal planning and scheduling in domains with predictable exogenous events. Journal of Artificial Intelligence Research, 25:187-231, 2006.

[Kautz and Selman, 1996] Henry Kautz and Bart Selman. Pushing the envelope: planning, propositional logic, and stochastic search. In Proceedings of the 13th National Conference on Artificial Intelligence and the 8th Innovative Applications of Artificial Intelligence Conference, pages 1194-1201. AAAI Press, 1996.

[Kautz and Selman, 1999] Henry Kautz and Bart Selman. Unifying SAT-based and graph-based planning. In Proceedings of the 16th International Joint Conference on Artificial Intelligence, pages 318-325. Morgan Kaufmann Publishers, 1999.

[Lu et al., 2013] Qiang Lu, Ruoyun Huang, Yixin Chen, You Xu, Weixiong Zhang, and Guoliang Chen. A SAT-based approach to cost-sensitive temporally expressive planning. ACM Transactions on Intelligent Systems and Technology (TIST), 5(1):18, 2013.

[Rankooh and Ghassem-Sani, 2015] Masood Feyzbakhsh Rankooh and Gholamreza Ghassem-Sani. ITSAT: an efficient SAT-based temporal planner. Journal of Artificial Intelligence Research, 53:541-632, 2015.

[Rintanen et al., 2006] Jussi Rintanen, Keijo Heljanko, and Ilkka Niemelä. Planning as satisfiability: parallel plans and algorithms for plan search. Artificial Intelligence, 170(12-13):1031-1080, 2006.

[Rintanen, 2007] Jussi Rintanen. Complexity of concurrent temporal planning. In ICAPS 2007. Proceedings of the Seventeenth International Conference on Automated Planning and Scheduling, pages 280-287. AAAI Press, 2007.

[Rintanen, 2012a] Jussi Rintanen. Engineering efficient planners with SAT. In ECAI 2012. Proceedings of the 20th European Conference on Artificial Intelligence, pages 684-689. IOS Press, 2012.

[Rintanen, 2012b] Jussi Rintanen. Planning as satisfiability: heuristics. Artificial Intelligence, 193:45-86, 2012.

[Rintanen, 2015a] Jussi Rintanen. Discretization of temporal models with application to planning with SMT. In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI-15), pages 3349-3355. AAAI Press, 2015.

[Rintanen, 2015b] Jussi Rintanen. Models of action concurrency in temporal planning. In IJCAI 2015, Proceedings of the 24th International Joint Conference on Artificial Intelligence, pages 1659-1665. AAAI Press, 2015.

[Shin and Davis, 2005] Ji-Ae Shin and Ernest Davis. Processes and continuous change in a SAT-based planner. Artificial Intelligence, 166(1):194-253, 2005.

[Wehrle and Rintanen, 2007] Martin Wehrle and Jussi Rintanen. Planning as satisfiability with relaxed ∃-step plans. In AI 2007: Advances in Artificial Intelligence: 20th Australian Joint Conference on Artificial Intelligence, Surfers Paradise, Gold Coast, Australia, December 2-6, 2007, Proceedings, number 4830 in Lecture Notes in Computer Science, pages 244-253. Springer-Verlag, 2007.

[Wolfman and Weld, 1999] Steven A. Wolfman and Daniel S. Weld. The LPSAT engine & its application to resource planning. In Proceedings of the 16th International Joint Conference on Artificial Intelligence, pages 310-315. Morgan Kaufmann Publishers, 1999.
