# parallel_behavior_composition_for_manufacturing__13c6d646.pdf

Parallel Behavior Composition for Manufacturing

Paolo Felli University of Nottingham, UK

paolo.felli@nottingham.ac.uk

Brian Logan University of Nottingham, UK

bsl@cs.nott.ac.uk

Sebastian Sardina RMIT University, Australia sebastian.sardina@rmit.edu.au

A key problem in the manufacture of highlycustomized products is the synthesis of controllers able to manufacture any instance of a given product type on a given production or assembly line. In this paper, we extend classical AI behavior composition to manufacturing settings. We ﬁrst introduce a novel solution concept for manufacturing composition, target production processes, that are able to manufacture multiple instances of a product simultaneously in a given production plant. We then propose a technique for synthesizing the largest target production process, together with an associated controller for the machines in the plant.

1 Introduction

Manufacturing companies are increasingly faced with demands for variable volumes of high quality customised products, produced rapidly and at low cost [Foresight, 2013]. One way of meeting such demands is through increased automation, and in particular allowing production control software greater autonomy in determining how products will be manufactured. The application of autonomous systems in manufacturing has the potential to increase productivity, ﬂexibility and reliability, add value, and compensate for an ageing skilled workforce. However, the complexity of such autonomous manufacturing systems has so far prevented their widespread adoption. A key challenge in realising the potential of autonomous manufacturing is moving from humanauthoring of the production control software that speciﬁes how a particular product should be made, to the automated synthesis of controllers that are able to manufacture any instance of a given product type on a particular production or assembly line.

The problem of synthesising complex (virtual) systems from simple existing modules has been addressed in AI, where it is referred to as the behavior composition problem [De Giacomo et al., 2013]. The composition task involves synthesizing a controller to coordinate a set of available behavior modules (e.g., automatic lights or blinds, TVs, microwaves, etc.) so as to implement a novel target behavior module (e.g., a smart house system). Standard behavior

composition assumes a single sequential execution of the target behavior module. While adequate for domains such as web-services [Calvanese et al., 2008], these approaches are not applicable to the manufacturing domain.

In manufacturing, the target represents a process speciﬁed by a production recipe, which lists the steps necessary to manufacture items of a particular product type, and the order in which those steps should be executed. The recipe encompasses all variations of the product type, e.g., variations in size, color or material, and decisions about how a particular item should be manufactured are made at run-time, based on order information associated with the item, e.g., whether the item is to be blue or green. The steps in a recipe are enacted by production resources, e.g., welding machines, painting machines, etc. For example, in garment manufacture, a production recipe may specify how to make shirts of various colors and sizes, and the production resources may be fabric cutting, sewing and pressing machines. In general, a recipe consists of multiple steps, and (depending on available production resources) it is often possible to start work on manufacturing another item before the items currently being manufactured have been completed. The aim is to construct a target production process that allows many instances of a product type (ideally as many as possible) to be produced at the same time on the available production resources. Critically, such a process must allow all variations of the product type speciﬁed by the recipe for each instance produced, as the speciﬁcation of each item (e.g., whether it will be blue or green) is only known at run-time.

In this paper, we extend behavior composition [De Giacomo et al., 2013; Stroeder and Pagnucco, 2009; Lustig and Vardi, 2009] to manufacturing settings. We formalise both production recipes and production resources, and propose a technique for synthesizing a target production process and controller capable of orchestrating the behaviors of the production resources to produce multiple instances of a product type in accordance with the recipe. To the best of our knowledge, our work is the ﬁrst to consider the synthesis of controllers for multiple concurrent instances of a target process.

2 Production Recipes and Cycles We begin by presenting the notions used to specify how items of a particular product type are manufactured. A production recipe captures the basic ﬁnite process required to manufac-

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16)

Figure 1: A simple product recipe R and its production cycle.

ture any instance (e.g., a blue shirt of size XL) of a particular product type (e.g., shirts).

Deﬁnition 1. A production recipe is a tuple R = (L,l0,A,λR,F) where (i) L is ﬁnite set of states and l0 L is the initial state; (ii) A is a ﬁnite set of actions; (iii) λR L A L is an acyclic transition function with all ﬁnal states being sink states; (iii) F L is the set of ﬁnal states.

A production recipe may be arbitrarily large and include options for which action to perform in a state that encode the different ways in which a product can be manufactured (e.g., due to variations within a product type). For example, once the fabric of a shirt has been cut, it may be dyed green, red, or blue: which color should be used for a particular product item is decided at run-time and is outside the model (e.g., by a human operator, or determined by information carried by the item itself, e.g., in the form of a radio-frequency identiﬁcation (RFID) tag specifying the color). If a recipe is actiondeterministic, i.e., only one action is available in each state of R, this indicates that there is only one possible way of manufacturing the product type.

We assume that recipes are acyclic and terminating : the manufacture of every product item requires a bounded number of steps.1 For example, producing an item of the product type speciﬁed by the recipe R depicted in Figure 1 requires performing operation (e.g., cut fabric) followed by either action β (e.g., dye blue) or γ (e.g., dye green), after which the item is deemed manufactured (states Y and Z are ﬁnal). Hence R speciﬁes two different manufacturing alternatives or product variations. As explained above, whether β or γ will be performed for a particular item is decided at run-time, externally to the recipe.

While recipes specify how one item of a product type is to be manufactured to completion, a production plant aims to manufacture many instances of a product simultaneously, over and over, in principle forever. This continuous process is captured through the notion of production cycle.

Deﬁnition 2. Given a product recipe R = (L,l0,A,λR,F), the production cycle induced by R is a tuple LR = (L F,l0,A,λ) such that λ(l) = λR(l) for all l L F, and λ(l) = l0 for each l F.

Intuitively, a production cycle represents the repeated execution of a recipe. To represent the completion of one item and the beginning of the next item of the product type, we add a transition from the ﬁnal states to the initial state of the recipe. When an item returns to the initial state (through a

1It is straightforward to add constructs specifying the bounded repetition of a subprocess, e.g., to model the re-trying of a process (clean subprocess) until success (product sensed clean), and exiting after some number of repetitions (discard product).

sequence of transitions within its production cycle), it is considered as completed, and the cycle can be restarted representing a new item of the product type entering the production line. For instance, the production cycle LR in Figure 1 is induced by the recipe R.

When L is a production cycle, we denote by L the extended cycle obtained from L by adding extra self-loops with the special action A in every state. The ( no-op ) transition is used to represent a pause step.

We will use the notion of traces in the usual way to represent legal runs of recipes and cycles. For example, a trace of a production cycle R is a ﬁnite sequence = l0a1l1a2 a l

(or just = l0 a1

l ), such that lj 1,aj,lj λ for all 1 j . When is a trace as above, last( ) denotes its last state t . Finally, unless stated otherwise, we assume that traces start from the initial state (in this case, l0 = l0).

3 Production Plants

Production recipes are enacted by a production plant, consisting of production resources and their associated capabilities (e.g., cutting, sewing, knitting, and pressing machines in a clothing factory). Following [De Giacomo et al., 2013], we model those capabilities as available behaviors of the form B = (B,b0,A,R) where B is ﬁnite set of states, b0 B is the initial state, A is the set of actions (we assume A), and R B A B is B s transition relation. Behaviors may be nondeterministic, and so are only partially controllable, i.e., once a behavior is instructed to execute an action, its evolution cannot be controlled. For example, a painting machine can non-deterministically evolve to a state signaling out-ofpaint after painting an item.

A production plant is composed of m 1 available behaviors, and is formally captured as the synchronous product of those behaviors. (We assume that the production plant is ﬁxed, as it represents the capabilities of the available production resources.) From now on, we denote by indx the set of vectors k ({1,...,m})m in which the components ki are pairwise distinct. As is customary, when X is a set, n 1, Xk = X X denotes the k cross-product of X; and when t is a tuple of size k, we use ti to denote its i-th component, for each 1 i k. When A is a set of actions, we use Ak

= (A { })k ({ })k to denote the set of vectors of k actions extended to include the distinguished no-op action.

Deﬁnition 3. Given a set of m 1 behaviors {B1,...,Bm}, with Bi = (Bi,b0i,A,Ri) for each 1 i m, the production plant system is a tuple S = (S,s0,Am

S = B1 Bm is the set of states;

s0 = b01,...,b0m is S s initial state;

indx S is S s transition relation such that b1,...,bm a k b

m in iff for each 1 i m, if ai = , then b

i = bi; otherwise b

ki Rk(bk,aki).

Intuitively, a transition b1,...,bm a k b

m in S captures the delegation of each action aj (including possibly no-op actions) to the available behavior Bkj, which evolves

, , β, , , ,

β, , β, , γ , β, γ

β, , , , β,

Figure 2: A plant S = B1,B2 . For compactness, ,γ in S stands for both ,γ 1,2 and γ, 2,1 .

from current state b

kj to successor state b

kj. As an example, Figure 2 shows two available behaviors and the resulting production plant system S.

As for production cycles, traces of production plant systems are sequences of the form s0 a1

s , which we refer to as histories. We use H to denote the set of histories of S.

3.1 Plant Controllers Intuitively, a production plant is operated as follows. At any point in time, the user or operator of the plant which may be human, software, or a hybrid of the two requests the execution of up to m domain actions, where m is the number of available behaviors in the plant. A controller then delegates each domain action to an available behavior in the plant that is capable of handling the action in the current situation. The designated behaviors then execute their assigned actions and evolve as speciﬁed by their transition relation, yielding the new state of the plant, from where a new request is issued and so on. The objective is to build a controller that always fulﬁlls the user s requests by appropriate delegation of domain actions.

Deﬁnition 4. Given a production plant system S with m 1 available behaviors, a plant controller (or controller) is a function

which delegates a set of actions a Am

to available behaviors in S: if P(h,a) = k, then P delegates action ai to available behavior Bki, for i {1,...,m}.

Note that our notion of controller extends that of De Giacomo et al. (2013) to multiple actions, as we assume a production plant can perform more than one operation at a time, due to the concurrent operation of production resources. However, since each resource can carry out one operation at a time, in a plant with m resources, one can execute up to m actions simultaneously.

4 Target Production Processes

With the notions of production recipe and production plant system in hand, we are now ready to present one of the main contributions of the paper, namely target production processes, as a solution concept for manufacturing composition. Since production plants aim to manufacture not one, but several items of a given product type simultaneously, the natural

β, β γ, γ β, γ γ, β

Figure 3: The 2-pcycles L2 (solid edges only) and L2

(solid and dotted edges) for production cycle LR from Figure 1.

question that arises is: what is the actual target process that a production plant needs to support?

Unlike in AI behavior composition (e.g., [De Giacomo et al., 2013]), the desired manufacturing process to be implemented in the production plant is neither arbitrary nor given as part of the input. Intuitively, the desired process involves the production of multiple items of a given product type, all conforming to the production cycle induced by the production recipe specifying how to manufacture items of the product type. We shall call such a process the target production process (TPP), or just the target. However it is not clear exactly what this process ought to be, let alone how to compute it, so we ﬁrst spell out the requirements. The ﬁrst requirements is:

R1. A TPP should allow the manufacture of multiple product

items simultaneously. When a TPP allows the manufacture of exactly n items in the plant at a given time, we say it is an n-TPP.

For example, while a shirt is being dyed by one production resource, the fabric of another shirt another product instance is being cut by another resource. One way of meeting R1 would be to deﬁne the production process to be the synchronous execution of multiple copies of the product production cycle, formally captured as the synchronous execution of n production cycles L or L (see Figure 3):

Deﬁnition 5. Given a production cycle L = (L,l0,A,λ) and n 1, the n-pcycle of L is a tuple Ln = (T,t0,An,δ) where:

T = Ln is the set of states of Ln. When t = l1, ,ln

T, we use sti(t) = li to denote the i-th component of t;

t0 = l0, ,l0 is the initial state of Ln; and

δ T An T is such that t = δ(t,a) (or t a t ) iff

sti(t ) = λ(sti(t),ai) for each i {1,...,n}.

However, taking Ln (or Ln

) as the production process is too demanding: they embed all possible ways in which n product items may be manufactured (including no-op transitions), whereas the production plant may only be able to produce n items concurrently in a particular way. To address this, one could consider extracting the maximal realizable target of Ln

[Yadav and Sardina, 2012; Yadav et al., 2013] to obtain the aspects of Ln

that can be realized in the plant. However the maximal realizable target may restrict some of the production cycles, as the resulting fragments may not account for the complete production cycle. This leads to our second requirement for TPPs:

AA XX AX2 XA2

Figure 4: T1 and T2 are not 2-targets for L as in Figure 1, while T3 is. Dashed edges represent transitions in δUN.

R2. For each product item in a TPP, the complete product

recipe should be available, no matter how the other product items are processed.

That is, when manufacturing each item, it should be possible to produce any variant of the product type (e.g., color of a shirt). In general, which variant a particular item corresponds to is not known at the outset (e.g., whether action β or γ will be performed is known until we reach state X in R; Figure 1). As a result, T1 in Figure 4 is not a 2-TPP for the recipe R, as it is unable to manufacture two β items only (or similarly two γ items): there is no transition β,β from state XX. Note that a naive projection from T1 on either item instance would indeed yield the production cycle of the product more is needed to capture this requirement.

To ensure that TPPs are both general and ﬂexible, we stipulate the following additional requirements:

R3. The possible evolutions of the TPP may depend on

how (nondeterministic) behaviors in the plant happen to evolve, as long as R2 is met.

This means that unlike in AI behavior composition, we shall accept target production processes that are nondeterministic.

R4. A TPP may pause certain items and account for different

ways in which all of them can be concurrently produced (the decision how exactly left to the user at run-time).

That is, a TPP may include several (though not all) alternative ways of manufacturing n items concurrently, including pausing certain items at some point, where this is necessary due to limitations in plant resources. However we restrict the ways that ( pause ) actions may be used:

R5. A TPP should never allow the starvation of any item:

every item must eventually be completed.

For example, T2 in Figure 4 is also not a 2-TPP for R, because the second item instance is commenced but never completely manufactured.

With the ﬁve requirements above, we are now ready to introduce the technical machinery to capture TPPs. We start by deﬁning a notion of fragments of an n-pcycle which is extended to incorporate memory and which differentiates between controllable and uncontrollable transitions.

Deﬁnition 6. Tuple T = (G,g0,An

UN) is an n-product of a production cycle L if (recall Ln

G = T M is the set of T s states, for some arbitrary

ﬁnite set M, and g0 = t0,m with m M is T s initial state. The set M will be used to encode additional memory;

G is an arbitrary transition relation such that if t,c a t ,c in δ , then t a t in δ; and

UN δ is arbitrary subset of transitions deemed uncontrollable and with the only requirement that if t,c a t ,c δ

UN, then there exists another transition t,c a t ,c δ

UN such that c c .

An n-product can therefore account for only some transitions of the corresponding n-pcycle (which means not every interleaving of the underlying production cycles are allowed), while extending it with additional memory (the set M) and distinguishing a subset of transitions as uncontrollable .

n-products constrained to adhere to requirements R1-R5 form then the basis of our deﬁnition of TPPs. In particular, we need to make sure that by pausing a product item we do not violate either R2 or R5. Intuitively, this amounts to checking whether, whenever an item instance i is paused at a state g of TPP T , all actions available for i in g will be eventually available, i.e., manufacture of the item can be fully resumed. We do so in two steps. First, for any action a and instance i, we deﬁne all possible acyclic traces of T in which we either (i) resume the item (after a sequence) by performing exactly a ﬁrst; or (ii) not resume the item at all. This step is captured by the deﬁnition of a closure tree. Second, we check whether we are able to enforce one of the traces for (i), irrespective of the uncontrollability represented by transitions in δUN. This step is captured by the deﬁnition of controllable subsets.

Given a n-product T as above and a state g G, the closure tree in T for a component i {1,...,n} and an action a A, denoted out T (g,i,a), is the set of all traces = g0a1g1a2 a g of T , with t0 = t, such that:

i = for all 1 j < , and either a

2. no transition is taken more than once, i.e., for each gj

and gq, if gj = gq then aj+1 aq+1 for 1 j,q < ;

3. for each 1 j < , if there exists a transition gj,a,g

δ for some g and a with ai = a, then aj+1

Informally, the closure tree out T (g,i,a) contains all possible ﬁnite traces of T , starting from g, such that (i) either it is never possible to execute a on the i-th component or only noop actions are allowed until a is executed; (ii) they are acyclic; and (iii) whenever a is executable on the i-th item instance at some step j of , then the next action vector aj+1 has a on the i-th component.

Deﬁnition 7. Given a set of traces of an n-product T , the set of controllable subsets of is the set contr( ) of subsets where, for each trace = g0a1 a g in , if for some 1 j we have gj 1,aj,gj δUN, then there is in also a trace = g0a1 ajg j for every g j gj with gj 1,aj,g j δUN (at least one exists by deﬁnition of δUN).

Informally, a set of traces contr( ) is closed under uncontrollability of transitions: there is no trace with an uncontrollable transition at some step, unless all possible evolutions also appear in some trace.

Putting it all together, we use controllable subsets of the closure tree to verify that items paused in an n-product T can always be resumed, so satisfying R2 and R5. For any i, g and a, given the (unique) set of traces out T (g,i,a), then an element of contr(out T (g,i,a)) is a subset of out T (g,i,a) such that, by controlling only controllable transitions, we can force the resulting path to be in the subset, irrespective of the nondeterminism of δUN.

Finally, we say that a vector b An

is subsumed by an action vector a An iff bi = ai or bi = , for each 1 i n.

We now have all the elements necessary to deﬁne a target production process, i.e., the process we would like to deploy in the production plant.

Deﬁnition 8. An n-product T = (G,g0,An

UN) is an n target production process (n-TPP) for a production cycle L (of a given product recipe R) if: (here Ln = (T,t0,An,δ))

T is an n-product of L; and

T is complete and fair wrt Ln, that is, for any state

t,m G and each transition t a t in Ln:

1. t,m b t ,m exists with b subsumed by a; 2. for each i {1,...,n}, if bi = then there ex-

ists contr(out T ( t ,m ,i,ai)) such that, for any g0 = t ,m as above and for any trace = g0c1g1c2 c g in we have c

i = ai: whenever an action ai is not replicated ( is executed instead), then it is still possible to enforce a trace in T in which ai can be ﬁnally executed.

Hence, an n-TPP is a system whose behavior is a subset of the behaviors of Ln

, but with possibly additional (bounded) memory. Note we do not require an n-TPP to be the largest such system (see later), only that it satisﬁes R1-R5.

As an example, consider T3 in Figure 4. It is easy to verify that it is indeed a 2-TPP of L, since for each trace Ln there exists a trace performing the same actions on both product instances, and which can be extended to account for each possible future evolution of . For instance, we can perform β (resp. γ) actions on both instances, by pausing one at a time. Of course, other 2-TPPs for L exist.

4.1 Compositions and Production Controllers Clearly, not any TPP is satisfactory for a given production plant. For instance, TPPs that cannot be realized (i.e., implemented) in the plant or that never take advantage of parallelism implicit in the plant s production resources, are not desired. Moreover, the TPP is not a problem input.

From now on, we will restrict ourselves to m-TPPs, where m is the number of behaviors (i.e., available machines) in S, as TPPs with more than m concurrent items will obviously never be implementable in a plant with m machines.

Following [De Giacomo and Sardina, 2007; De Giacomo et al., 2013] we ﬁrst formally deﬁne what it means for a plant controller P to realize a given trace of an m-TPP

T in a production plant system S: P is able to delegate each action in the vector (including actions), at each step, to a suitable machine in the plant. For example, consider the trace = AA , XX ,β XA of the TPP T3 from Figure 4 and plant S from Figure 2. Then is realized in S by any controller P such that P(00, , ) = 1,2 and P(00 , 11 ,β ) = 2,1 , corresponding to one of the two traces 00 , 1,1 11 ,β 2,1 01 and 00 , 1,1 11 ,β 2,1 11 in S. Note, however, that while these traces can be extended in T3 with the action vector β, , this action vector can be replicated in S only from 11 and not from 01, so any trace of the form β, is not realizable. Then, again as in classical behavior composition, we say that P is a plant composition of a TPP T in S iff P realizes all the traces of T in S. When a composition for a TPP T exists on a plant S, we say that T is realizable in S.

Finally, because, unlike in behavior composition, the target production process is not an input to our parallel composition problem, we deﬁne the solution concept as a pair T ,P , such that T is an m-TTP for L and P a composition for T in S. We refer to these pairs as production controllers for L in S.

5 Largest Target Production Process Production controllers may be evaluated against various metrics (such as average throughput, machine utilisation, load balancing, etc.) to isolate the most efﬁcient ones. As metrics are highly dependent on the application and production plant, in this section we focus on computing production controllers T ,P where T is the largest realizable production process, that is, which embody (i.e., can mimic ) all realizable production processes. It turns out such controllers can be deﬁned in terms of the standard notion of simulation between transition systems [Milner, 1971], and in particular its nondeterministic variant [De Giacomo et al., 2013]: a target production process T simulates another target production process T , denoted T T , iff T can replicate every action of T , step by step. The largest realizable production process for a given production cycle L and a production plant system S is threfore the m-TPP T for L that (i) is realizable in S; and (ii) can simulate any other realizable m-TPP T .

To capture property (i), we shall deﬁne our own notion of simulation between the production plant S and the mpcycle Lm, which expresses what it means for S to be able to mimic exactly m copies of the production cycle. Moreover, in order to synthesize the largest realizable m-TTP, we will also need to compute, as required by Deﬁnition 6, the set M encoding arbitrary memory and the set of uncontrollable transitions δUN. Crucially, both of these can be extracted from the so-called enacted system, capturing the joint execution of Lm

with the system S.

Given a production cycle L, we deﬁne the enacted system S

L as the synchronous product of Lm

and the production plant system S, so that a state of S

L is a pair t,s , where t a state of Lm

and s is a state of S. If is the transition relation of S

L, then we use

ND to denote the set of nondeterministic transitions of : s a, k s is in

ND iff s a, k s

and s a, k s are in , for some s s . We also de-

ﬁne the notion of closure tree out S

L(s ,i,a) for S

L as for n-product (see Section 4), and contr(out S

L(s ,i,a)) is as in Deﬁnition 7, but this time with respect to transitions in

ND. A subset in contr(out S

L(s ,i,a)) is thus a set of traces of the enacted system S

L which can be enforced irrespective of the nondeterminism of the production plant system. Deﬁnition 9. Given an m-pcycle Lm = (T,t0,Am,δ) for some production cycle L, a lazy simulation relation of Lm by S = (S,s0,Am

, ) is a relation σ+ S T such that s,t σ+ implies that for each a, if t a t is in Lm then b,k exist such that:

(1) there exists a transition s b k s and b is subsumed by a each ai is either replicated in bi or replaced by ;

(2) for each such s we have s ,t σ+ with t b t in Lm

any possible new system state s is in the relation with the state t , which is unique; and (3) given t , s and b as above, for each index 1 i

n, if bi = then there exists a set of traces contr(out S

L(s ,i,ai)), for s = t ,s , such that for

every trace s c1k1

L in , we have (i) c

i = ai and (ii) sj,tj σ+ for j the action ai can always eventually be performed in at least one controllable subset of traces in the enacted system, by visiting only states preserving σ+.

We say that a state s of S lazily simulates a state t of Lm, denoted s t, iff there exists a lazy simulation relation σ+ of Lm by S with s,t σ+. We say that S lazily simulates Lm, denoted S Lm, iff s0 t0. Then S Lm captures the fact that S can always replicate all actions vectors of Lm, but not in a step-by-step fashion, as some product instances may be paused. However, it guarantees the requirements discussed in Section 4.

Once the lazy simulation relation is computed, if S Lm, we can then build a ﬁnite state program, called the maximal controller generator (CG), that returns, at each step, the set of all possible action vectors that can be delegated to behaviors. Deﬁnition 10. Given S, Lm and Lm

as before, if S Lm then the maximal CG for Lm and S is the tuple C = (W,w0,Am

,γ,!) such that W = { t,s T S s t} and w0 = t0,s0 ;

γ W Am indx W is such that t,s a k t ,s γ iff t a t is in Lm

, s a k s is in S; and

indx) is the controller function, deﬁned as !(w) = { a,k w a k w is in γ for some w }.

Given the current state t of Lm and the current state s of the production plant system, the maximal controller generator returns the set of possible delegations, such that each member a,k represents the delegation of each action ai to the behavior speciﬁed by ki. A production controller T ,P is generated by C as follows:

T = (W,w0,Am

UN) is an m-TTP, and its transition relation γ W Am

W is such that w a w is in γ iff a,k !(w);

UN is the subset of γ deﬁned as the set of transitions w a w in γ such that w a w is in γ for some w w : multiple transitions deriving from the existence of different available delegations of actions to behaviors (i.e. same a, different k) constitute controllable transitions, whereas multiple transitions deriving from the production plant system s nondeterminism (i.e. same a and k) will constitute uncontrollable transitions;

P is such that for any system history h = s0 a1k0

a k s (with s0 = s0) and action vector a, if s0,t0

a1k1 s1,t1 a2k2

s ,t is in C for some t1 t , then P(h,a) = k only if a,k !( s ,t ). T is obtained by projecting out delegation indexes from transitions, while the corresponding composition P is obtained by selecting, at each step, for a given action a, a vector k such that a transition from the current state exists with label a,k. Note that this means that the second component in W T S plays the same role of the arbitrary set M in Deﬁnition 6: for the same state in Lm, there may be more than one possible state in the production plant system S, as this is nondeterministic (e.g., states 11,AX and 01,AX in Figure 5). Theorem 1. Let C be the maximal CG for Lm and S. Then (i) any P is a composition for T in S iff T ,P is generated by C and (ii) T is the largest realizable m-TPP for L in S. Proof. (Sketch) Let H ,P be the set of histories of S that may be obtained by running the controller P in S to match all the actions in a given trace of an m-TTP T (see [De Giacomo and Sardina, 2007; De Giacomo et al., 2013]).

(i) ( ) P is a composition for T , i.e. it realises any trace = t0 a1

(recall t0 = t0). First, we show that for any history h = s0 a0k0

s in H ,P (s0 = s0) we have sj tj, for any 1 j (for j = 0 this follows by Deﬁnition 10). By induction on , h +1 H +1

,P is such that s a +1k +1

is in S by deﬁnition, so t ,s a +1k +1

t +1,s +1 is in C, P(h ,a) = k +1, and from s t it follows that s +1 t +1. Then, for any and h H ,P as above with h < , by the deﬁnition of for any action a with t a +1

t +1 in T , we have s a +1k s +1 in S. Further, assume T is not a m-TPP for L. Then by Deﬁnition 8, for some trace t0,s0 b1k1

t ,s we have t a t in Lm such that either (1) no t ,s b +1

t +1,s +1 exists s.t. b +1 is subsumed by a, or (2) there exists i s.t. bi = and in every set in contr(out T ( t +1,s +1 ,i,ai)) there is a trace t +1,s +1 b +2

t +q,s +q and b +q ai. Then it is easy to see how this implies the same for every set in contr(out S

L( t +1,s +1 ,i,ai)). Thus we exclude both (1) and (2) as they contradict s t .

(i) ( ) We prove that for any trace t0 a1

t of T and history h = s0 a1k1

,P , if P is a composition for T then t0,s0 b1k1

t ,s a k t,s is in C for k = P(h,a). If not, then a,k !( t ,s ), and, by deﬁnition, either there is no s a k s +1 in S, no t a

Figure 5: A graphical representation of the maximal CG for the production cycle LR in Figure 1 and the production plant system S in Figure 2. A generated production controller T ,P is as follows: T is obtained by removing indexes k from transitions (it is thus unique); given a system history h and an action vector a, P(h,a) = k only if a transition labelled with a,k exists from last(h) (hence many P exist). Solid and dashed lines represent controllable and uncontrollable transitions of T , respectively, and n denotes no-op actions. For instance, by delegating the action vector β, to behaviors 1,2 from state 11,XX (namely β to B1 and to B2), the next state can either be 11,AX or 01,AX , depending on the successor state reached by B1 (see Figure 2). On the other hand, the two transitions labelled with , from 01,AA are controllable: they represent two distinct control choices, to delegate action to B2 and to pause B1 or vice-versa, respectively.

or s +1 t +1. In the ﬁrst two cases P is not a composition. If the latter holds, it must be the case that in any trace s +1,t +1 a +1k +1

condition (3) of Deﬁnition 9 does not hold, hence T ,P is not a production controller generated by C.

(ii) If T T exists, then there exists a trace such that last( ) b t is in T for some t, but last( ) b t is not in T for any t . Since T is realizable, there exists a composition P s.t. h H ,P where P (h,b) is deﬁned, and P(h,b) is not. So there is no trace t0,s0 b1k1

in C, with s0 b1k1

s = h and t0 b1

t = , s.t. t ,s b k t,s for some s and k, and t as above. Then by construction either s t or s t.

Computing the lazy simulation relation between S and Lm can be done by following the algorithm for the standard simulation relation in [De Giacomo et al., 2013], but checking the controllable closure tree at each step (Deﬁnition 9). An algorithm can be derived accordingly. Performance can be improved by reordering actions a according to indexes k in S so as to substantially reduce the number of transitions.

6 Conclusions

In this paper we consider the parallel behavior composition problem in a manufacturing setting, where many instances of a product are to be manufactured on a production line. We introduced a novel solution concept, target production pro-

cesses, and showed how to generate the largest realizable TPP for a given production plant.

There are subtle conceptual and technical differences between the parallel behavior problem and classical behavior composition in the AI literature. In particular, multiple concurrent actions must be delegated to the available behaviors in the plant, rather than just one. We note that this is not the same as multiple behavior composition [Sardina and De Giacomo, 2008], in which several target modules are realized in the same shared available system. Multiple behavior composition is equivalent to realizing Lm

, which we argue is overly demanding. Moreover, the target desired module is not given as an input to the problem, but is part of the solution constructed from the speciﬁed production recipe and plant.

Our work is just the ﬁrst step in manufacturing composition. We have deﬁned the problem, and provided a notion of adequacy for solutions in the form of TPPs that respect requirements R1-R5. Further work is needed in order to reﬁne m-TPPs to efﬁcient manufacturing processes, for example, with respect to average throughput, machine utilization, load balancing, etc. Such optimizations can be done after the TPP has been built, or possibly during synthesis by discriminating between controllers from individual lazy simulation relations σj

+. Another avenue we are interested in pursuing, is linking our parallel composition problem with the composition of high-level programs [Sardina and De Giacomo, 2009], at least from a representational perspective.

Acknowledgments We thank the anonymous reviewers for their helpful feedback. We acknowledge the support of the Australian Research Council (under DP120100332) and the RMIT Foundation (under an International Visiting Fellowship for the second author to visit RMIT University).

References [Calvanese et al., 2008] Diego Calvanese, Giuseppe De Gia-

como, Maurizio Lenzerini, Massimo Mecella, and Fabio Patrizi. Automatic service composition and synthesis: The Roman Model. IEEE Data Engineering Bulletin, 31(3):18 22, 2008. [De Giacomo and Sardina, 2007] Giuseppe De Giacomo and

Sebastian Sardina. Automatic synthesis of new behaviors from a library of available behaviors. In Proceedings of the International Joint Conference on Artiﬁcial Intelligence (IJCAI), pages 1866 1871, 2007. [De Giacomo et al., 2013] Giuseppe De Giacomo, Fabio Pa-

trizi, and Sebastian Sardina. Automatic behavior composition synthesis. Artiﬁcial Intelligence, 196:106 142, 2013. [Foresight, 2013] The future of manufacturing: A new era of

opportunity and challenge for the UK. The Government Ofﬁce for Science, London, 2013. Ref: BIS/13/810. [Lustig and Vardi, 2009] Yoad Lustig and Moshe Y. Vardi.

Synthesis from component libraries. In Proceedings of the International Conference on Foundations of Software Science and Computation Structures (Fo SSa CS), pages 395 409, 2009. [Milner, 1971] Robin Milner. An algebraic deﬁnition of sim-

ulation between programs. Technical report, Stanford University, Stanford, CA, USA, 1971. [Sardina and De Giacomo, 2008] Sebastian Sardina and Giuseppe De Giacomo. Realizing multiple autonomous agents through scheduling of shared devices. In Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS), pages 304 312, 2008. [Sardina and De Giacomo, 2009] Sebastian Sardina and Giuseppe De Giacomo. Composition of Con Golog programs. In Proceedings of the International Joint Conference on Artiﬁcial Intelligence (IJCAI), pages 904 910, 2009. [Stroeder and Pagnucco, 2009] Thomas Stroeder and Mau-

rice Pagnucco. Realising deterministic behaviour from multiple non-deterministic behaviours. In Proceedings of the International Joint Conference on Artiﬁcial Intelligence (IJCAI), pages 936 941, 2009. [Yadav and Sardina, 2012] Nitin Yadav and Sebastian Sar-

dina. Qualitative approximate behavior composition. In Proceedings of the European Conference on Logics in Artiﬁcial Intelligence (JELIA), volume 7519 of Lecture Notes in Computer Science (LNCS), pages 450 462. Springer, 2012. [Yadav et al., 2013] Nitin Yadav, Paolo Felli, Giuseppe De

Giacomo, and Sebastian Sardi na. Supremal realizability of

behaviors with uncontrollable exogenous events. In Proceedings of the International Joint Conference on Artiﬁcial Intelligence (IJCAI), pages 1176 1182, 2013.