# ltl__91cc99b9.pdf LTLf and LDLf Synthesis under Partial Observability Giuseppe De Giacomo Sapienza Universit a di Roma Roma, Italy degiacomo@dis.uniroma1.it Moshe Y. Vardi Rice University, Houston, TX, USA vardi@cs.rice.edu In this paper, we study synthesis under partial observability for logical specifications over finite traces expressed in LTLf/LDLf. This form of synthesis can be seen as a generalization of planning under partial observability in nondeterministic domains, which is known to be 2EXPTIMEcomplete. We start by showing that the usual belief-state construction used in planning under partial observability works also for general LTLf/LDLf synthesis, though with a jump in computational complexity from 2EXPTIME to 3EXPTIME. Then we show that the belief-state construction can be avoided in favor of a direct automata construction which exploits projection to hide unobservable propositions. This allow us to prove that the problem remains 2EXPTIME-complete. The new synthesis technique proposed is effective and readily implementable. 1 Introduction LTLf, linear temporal logic on finite traces, has been extensively used in AI [Bacchus and Kabanza, 2000; Gerevini et al., 2009; De Giacomo and Vardi, 2013], as well as in other areas of CS such as in business process modeling [Pesic and van der Aalst, 2006; Sun et al., 2012; De Giacomo et al., 2014]. In [De Giacomo and Vardi, 2013], LTLf has been extended to LDLf, linear dynamic logic over finite traces, which fully captures monadic secondorder logic on finite traces (vs. first-order logic captured by LTLf). LDLf includes regular expressions in the temporal operators, which can be used to express, e.g., procedural constraints on executions. Interestingly, reasoning in LTLf/LDLf can be done through manipulation of finite state automata (on finite words). In [De Giacomo and Vardi, 2015] LTLf/LDLf synthesis under full observability has been studied. This is a general form of adversarial synthesis, related to the classical Church realizability problem [Church, 1963; Vardi, 1996], that has been thoroughly investigated in the infinite setting, starting from [Pnueli and Rosner, 1989]. From an AI perspective LTLf/LDLf synthesis is a generalized form of conditional planning under full observability in nondeterministic domains, where the agent controls the choice of ac- tions, while the environment control their nondeterministic effect (notice the devilish, or adversarial, nature of such nondeterminism) [Rintanen, 2004]. The worst case complexity of LTLf/LDLf synthesis under full observability is 2EXPTIMEcomplete [De Giacomo and Vardi, 2015], and it lowers to EXPTIME when we do not mix angelic nondeterminism in the specification (analogous to the ability of guessing the right path towards a final state of an NFA, which makes the specification more succinct) with the devilish nondeterminism due to the adversarial choice. This is exactly what happens in conditional planning under full observability, in which the specification (the planning domain, initial state and reachability goal) does not generate angelic nondeterminism, and the only nondeterminism is the devilish one corresponding to the adversarial choice by the environment of the effects of the action selected by the agent. Indeed conditional planning under full observability is EXPTIME-complete [Rintanen, 2004]. In this paper we study LTLf/LDLf synthesis under partial observability, which in turn is a generalized form of conditional planning under partial observability, known to be 2EXPTIME-complete [Rintanen, 2004]. In particular, we consider propositions partitioned into two sets: the first under the control of the agent and the second under the control of the environment. The key point is that only some of the environment variables are observable. The specification consists of an LTLf/LDLf formula (typically expressed as a conjunction of a finite set of formulas), which expresses how environment s and agent s propositions should jointly evolve over time. The problem of interest is checking whether there exist strategies for the agent to set the controllable propositions over time, depending only on the history of the observable environment propositions, so that regardless of the values assumed by the unobservable propositions, the LTLf/LDLf formula is fulfilled. If such strategies exist, it is of interest to compute one. To set the analogy with planning, consider actions (represented as propositions) as the agent s controllable propositions and fluents as the environment s ones. Only some fluents are observable. Then strategies above are generalizations of conditional plans under partial observability. Inspired by the analogy with planning, we first look into the usual belief-state construction [Goldman and Boddy, 1996], which is (sometimes implicitly) at the base of most results in planning under partial observability [Geffner and Bonet, 2013; Bonet and Geffner, 2000; Hoffmann and Braf- Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16) man, 2005; Bertoli et al., 2006; Bryce et al., 2006; Albore et al., 2009; Maliah et al., 2014]. We show that also for LTLf/LDLf synthesis such construction works. Unfortunately, it yields a 3EXPTIME technique, which reduces to 2EXPTIME only when the specification does not induce forms of angelic nondeterminism, as in the case of conditional planning under partial observability. We then show that we can avoid the belief-state construction altogether in favor of a direct automata construction based on projecting out the unobservable propositions and complementation. This new technique gives us a 2EXPTIME upper-bound for LTLf/LDLf synthesis under partial observability, which matches the 2EXPTIME-hardness of conditional planning under partial observability [Rintanen, 2004]. Hence we get that, at least from the worst-case complexity point of view, the generalization from conditional planning to full LTLf/LDLf synthesis under partial observability is for free. The new technique proposed is easily implementable with standard automata operations, and works for a variety of linear time specifications formalisms, ranging from explicit DFA s to arbitrary LTLf/LDLf formulas. 2 LTLf and LDLf LTL on finite traces, or LTLf, has essentially the same syntax as LTL on infinite traces [Pnueli, 1977], namely, given a set P of propositional symbols, LTLf formulas ' are as follows: ' ::= φ | ' | '1 '2 | ' | '1 U '2 where φ is a propositional formula over P, is the next operator and U is the until operator. We use the usual abbreviations such as '1_'2 .= ( '1 '2); eventually as ' .= true U '; always as ' .= '; and ' .= '. (Note that on finite traces ' 6 '.) LTLf is as expressive as FO (first order logic) over finite traces and star-free regular expressions, so strictly less expressive than regular expressions, which in turn are as expressive as MSO (monadic second order logic) over finite traces. On the other hand, regular expressions are not convenient for expressing temporal specifications, since, for example, they miss direct constructs for negation and for conjunction. For this reason [De Giacomo and Vardi, 2013] introduced1 LDLf (linear dynamic logic on finite traces), which merges LTLf with regular expressions through the syntax of the wellknown logic of programs PDL, propositional dynamic logic [Fischer and Ladner, 1979; Harel et al., 2000], but adopting a semantics based on finite traces. Formally, LDLf formulas ' are built as follows: ' ::= φ | ' | '1 '2 | h i' ::= φ | '? | 1 + 2 | 1; 2 | where φ is a propositional formula over P; denotes path expressions, which are regular expressions over propositional formulas φ with the addition of the test construct '? typical of PDL. As in PDL, we use the abbreviation [ ]' .= h i '. Intuitively, h i' states that, from the current step in the trace, there exists an execution satisfying the regular expression such that its last step satisfies ', while [ ]' states that, 1An adaptation of LDL interpreted over infinite traces, in turn introduced in [Vardi, 2011]. from the current step, all executions satisfying the regular expression are such that their last step satisfies '. Tests are used to insert into the execution path checks for satisfaction of additional LDLf formulas. LDLf is indeed as expressive as MSO over finite words, and easily captures LTLf by seeing next and until as abbreviations: ' .= htruei' and '1 U '2 .= h('1?; true) i'2. The semantics of LDLf is given in terms of finite traces, i.e., finite words, denoting a finite, nonempty, sequence of consecutive steps over the alphabet 2P. We use the notation length( ) and (i), and in addition we denote by (i, j) the segment of the trace starting at the i-th step and ending at the j-th step. If j > length( ), we get the segment from the i-th step to the end. The satisfaction relation is defined by simultaneous induction on formulas and path expressions as follows: given a finite trace , an LDLf formula ' is true at a step i, with 1 i length( ), in symbols , i |= ', if: , i |= φ iff (i) |= φ (φ propositional); , i |= ' iff , i 6|= '; , i |= '1 '2 iff , i |= '1 and , i |= '2; , i |= h i' iff there exists i j length( ) such that (i, j) 2 L( ) and , j |= '; where the relation (i, j) 2 L( ) is as follows: (i, j) 2 L(φ) if j=i+1, j length( ), and , i |= φ (φ propositional); (i, j) 2 L('?) if j = i and , i |= '; (i, j) 2 L( 1 + 2) if (i, j) 2 L( 1) or (i, j) 2 L( 2); (i, j) 2 L( 1; 2) if there exists k, with i k j, such that (i, k) 2 L( 1) and (k, j) 2 L( 2); (i, j) 2 L( ) if j = i or there exists k, with i k j, such that (i, k) 2 L( ) and (k, j) 2 L( ). We say that a trace satisfies an LTLf/LDLf formula ', written |= ', if , 1 |= '. Also, sometimes we denote by L(') the set of traces that satisfy ': L(') = { | |= '}. We can associate with LDLf formulas ' a (possibly exponentially larger) NFA A' that accepts exactly the traces that satisfy '. To do so (i) we translate ' into alternating automaton on words (AFW) (whose number of states is polynomial in the size of ') that accepts exactly the traces that satisfy ', and then (ii) transform the AFW into a NFA [De Giacomo and Vardi, 2013] (whose number of states is exponential in that of the AFW, and hence in the size of ' this exponential blowup is unavoidable). These two steps can be combined into a simple direct algorithm for computing the NFA corresponding to an LDLf formula [De Giacomo and Vardi, 2015]. LTLf/LDLf synthesis under full observability. Synthesis under full observability [Vardi, 1996; Pnueli and Rosner, 1989] has been studied for LTLf/LDLf in [De Giacomo and Vardi, 2015]. Essentially it is as follows. We partition the set P of propositions into two disjoint sets X and Y. We assume to have no control on the truth value of the propositions in X, while we can control those in Y. The problem of interest is: can we control the values of Y in such a way that for all possible values of X the LTLf/LDLf specification ' remains true? More precisely, traces now assume the form = (X0, Y0)(X1, Y1)(X2, Y2) (Xn, Yn), where (Xi, Yi) is the propositional interpretation at the i-th position in , now partitioned in the propositional interpretation Xi for X and Yi for Y. Let us denote by X |i the trace projected only on X and truncated at the i-th element (included), i.e., X |i = X0X1 Xi. Synthesis consists in computing a function f : (2X ) ! 2Y such that for all with Yi = f( X |i), we have that satisfies the formula '. Realizability is the corresponding recognition problem, i.e., checking that such a function exists. As usual we will blur the distinction between the two and talk about complexity of synthesis, to intend the complexity of the associated recognition problem, i.e., realizability. Observe that in synthesis we have no way of constraining the value assumed by the propositions in X: the function we are looking for only acts on propositions in Y. DFA games. The basic technique introduced in [De Giacomo and Vardi, 2015] to solve synthesis under full observability is to reduce the LTLf/LDLf specification to a DFA automata and then play over it the so called DFA game. DFA games are games between the agent and the environment. A round of the game consists of both the agent and the environment setting the values of the propositions they control. A (complete) play is a word in (2X 2Y) describing how the agent and environment set their propositions at each round till the game stops. The specification of the game is given by a DFA G of the form G = (2X 2Y, S, s0, δ, F), where: 2X 2Y is the alphabet of the game; S are the states of the game; s0 is the initial state of the game; δ : S 2X 2Y ! S is the transition function of the game: given the current state s and a choice of propositions X and Y for the enviroment and the agent, δ(s, (X, Y )) = s0 is the resulting state of the game; F are the final states of the game, where the game can be considered terminated. A play is winning for the agent if such a play leads from the initial to a final state. A strategy for the agent is a function f : (2X ) ! 2Y that, given a history of choices from the environment, decides which propositions Y to set to true/false next. A winning strategy is a strategy f : (2X ) ! 2Y such that for all with Yi = f( X |i) we have that leads to a final state of G. Notice that in the DFA game, in spite of the name, we do have devilish nondeterminism, corresponding to uncontrollability of the propositions X. What has been removed is the angelic nondeterminism coming from the LTLf/LDLf specification, which can be translated (in exponential time) into an NFA, which, if not already a DFA, requires a further determination (another exponential step). To better grasp the game from a planning point of view, consider X to be fluents and Y to be actions. Then the DFA game is akin to an explicit state representation of a conditional planning under full observability problem, with the final states of the DFA game acting as the goal. To actually compute the strategy, we start by defining the controllable preimage Pre C(E) of a set E of states of G as the set of states s such that there exists a choice of values for propositions Y such that for all choices of values for propositions X, game G progresses to states in E. Formally: Pre C(E) = {s 2 S | exists Y 2 2Y such that for all X 2 2X we have δ(s, (X, Y )) 2 E} Then, we define the set Win(G) of winning states of the DFA game G, i.e., the set formed by the states from which the agent can win G, as a least-fixpoint, making use of approximates Wini(G) denoting all states where the controller wins in at most i steps: Win0(G) = F (the final states of G); Wini+1(G) = Wini(G) [ Pre C(Wini(G)). Then, Win(G) = S i Wini(G). Notice that computing Win(G) requires linear time in the number of states in G. Indeed, after at most a linear number of steps Wini+1(G) = Wini(G) = Win(G). A DFA game G admits a winning strategy iff s0 2 Win(G). Then, we define a strategy generator based on the winning sets Wini(G). This is a nondeterministic transducer, where nondeterminism is of the kind don t-care : all nondeterministic choices are equally good. The strategy generator TG = (2X 2Y, S, s0, %, !) is as follows: 2X 2Y is the alphabet of the trasducer; S are the states of the trasducer; s0 is the initial state; % : S 2X ! 2S is the transition function such that %(s, X) = {s0 | s0 = δ(s, (X, Y )) and Y 2 !(s)}; ! : S ! 2Y is the output function such that !(s) = {Y | if s 2 Wini+1(G) Wini(G) then 8X.δ(s, (X, Y )) 2 Wini(G)}. The transducer TG generates strategies in the following sense: for every way of restricting !(s) to return only one of its values (chosen arbitrarily), we get a strategy. To obtain the DFA game given an LTLf/LDLf specification we perform some preprocessing, to get from a compact representation in logic to an explicit representation in terms of game states, and to get rid of the angelic nondeterminism. As a result, the complexity of LTLf/LDLf synthesis is 2EXPTIME-complete in general and EXPTIME-complete in the case of LTLf/LDLf specifications that do not give rise to angelic nondeterminism, as for conditional planning. 3 Synthesis Under Partial Observability Now, we introduce synthesis for LTLf/LDLf specification under partial observability. As before, we partition the set P of propositions into two disjoint sets X and Y. As before, we assume to have no control on the truth value of the propositions in X, while we can control those in Y. In addition, we assume that only a subset Obs of the uncontrollable propositions X are actually observable. The synthesis problem intuitively is: can we choose suitably the values of Y on the basis of what observed so far in such a way that for all possible values of X the LTLf/LDLf specification ' remains true? More precisely, as before, traces have the form = (X0, Y0)(X1, Y1)(X2, Y2) (Xn, Yn), where (Xi, Yi) is the propositional interpretation at the i-th position in , partitioned in the propositional interpretation Xi for X and Yi for Y. But now, given a propositional interpretation Xi for X, we denote by obs(Xi) the projection of Xi on the propositions in Obs. Given a trace = (X0, Y0)(X1, Y1)(X2, Y2) (Xn, Yn), we define the corresponding observable trace ! as: obs( ) = (obs(X1), Y1)(obs(X2), Y2) (obs(Xn), Yn). Vice versa, given an observable trace ! we define the set full(!), denoting all traces giving rise to the same observable trace !, as: full(!) = { | obs( ) = !} As before, we denote by X |i the trace projected only on X and truncated at the i-th element (included), i.e., X |i = X0X1 Xi. Similarly, we denote by Obs|i the observable trace obs( ) projected only on Obs and truncated at the i-th element. The synthesis problem under partial observability consists in computing a partial function f : (2Obs) ! 2Y such that for all with Yi = f( Obs|i) (note that this implies that it is defined), we have that satisfies formula '. To solve the synthesis problem, we can try to proceed as in the case of full observability. We can reduce our LTLf/LDLf specification to a DFA A over the alphabet 2X 2Y. Over such DFA, however this time, we need to play the DFA game under partial observability. A round of the game consists of both the agent and the environment setting the values of the propositions they control. A (complete) play is a word in (2X 2Y) describing how the agent and the environment set their propositions at each round till the game stops. A play is winning for the agent if such a play leads from the initial to a final state. A partial-observability-strategy for the agent is a partial function f A : (2Obs) ! 2Y that, given a history of choices from the environment, decides which propositions Y to set to true/false next based only on the observable history of the environment. A winning partialobservability-strategy is a strategy f A : (2Obs) ! 2Y such that for all with Yi = f A( Obs|i) we have that leads to a final state of the DFA A. The synthesis problem consists of computing a winning strategy. We show below that one possible way to solve DFA games under partial observability is to use the belief-state construction typically used in conditional planning under partial observability [Goldman and Boddy, 1996; Rintanen, 2004; Bertoli et al., 2006]. 4 Belief-States Construction We define belief-states DFA game GObs A , or simply G, associated with a DFA game A = (2X 2Y, S, s0, δ, F) as the following DFA game: G = (2X 2Y, B, B0, @, F), where: 2X 2Y is the alphabet of the partial observability game; B = 2S are the states of the partial observability game, which are called belief states, and correspond to sets of the states of the original game [Goldman and Boddy, 1996]; B0 = {s0} is the initial state of the partial observability game, which corresponds to that of the original game; @ : B 2X 2Y ! B is the transition function of the game: given the current state B and a choice of propositions X and Y , respectively for the enviroment and the agent, the transition function is defined as: @(B, (X, Y )) = {s0 | exists s, X0 s.t. s 2 B and obs(X0)=obs(X) and δ(s, (X0, Y )) = s0} F = 2F are the final states of the game, where the game can be considered terminated. Observe that for X1 and X2 such that obs(X1)=obs(X2), we have @(B, (X2, Y )) = @(B, (X1, Y )), for all B and Y . In other words the partial observability game cannot distinguish between assignments of the uncontrolled propositions that are observationally equivalent. A crucial observation in the finite trace setting is the following: a strategy reaches a final state if and only if it reaches a final belief state. This observation has been first made in the context of conditional planning under partial observability: a plan reaches the goal iff it reaches a belief state where the goal is satisfied, see Theorem 3.5 in [Bertoli et al., 2006]. Notice that G is a standard fully observable DFA game, though in the belief state space. Theorem 1. Let A be DFA game under partial observability, and G the corresponding belief-state DFA game. Then A admits a winning strategy iff G does. Proof (sketch). If-Direction. Form the observation that for X1 and X2 such that obs(X1) = obs(X2), we have @(B, (X2, Y )) = @(B, (X1, Y )), for all B and Y , It is immediate to see that every strategy for G is also a partialobservability-strategy for A. Indeed, f G( X |i) = f G( 0 X |i) as long as Obs|i = 0 Obs|i. Moreover if @(B0, ) 2 F then @(B0, ˆ ) 2 F for every ˆ 2 full(obs( )) and hence for δ(s0, ˆ ) 2 F for every ˆ 2 full(obs( )). Only-if-Direction. Suppose we have a strategy for f A : (2Obs) ! 2Y for A that is winning, i.e., such that for every with Yi = f A( Obs|i) we have that leads to a final state of A. Then f G : (2X ) ! 2Y for G defined as f G( X |i) = f A( Obs|i) is a winning strategy for G since it leads to F. Hence we can concentrate on the belief-state DFA Game G, which is a standard DFA game and use it to solve the corresponding DFA game under partial observability. The construction above requires an exponential blow-up in the original number of states to get the belief-state DFA game, which in turn can be solved polynomially. In fact it can be shown that DFA games with partial information can solve two-player reachability games which are indeed EXPTIMEcomplete [Reif, 1984].2 Hence we get: Theorem 2. Let A be DFA game under partial observability. Then deciding whether A admits a winning strategy is EXPTIME-complete. LTLf /LDLf synthesis via belief-state construction. To do synthesis in LTLf or LDLf, we translate an LTLf/LDLf specification ' into (an AFW and then into) an NFA A'. This is an exponential step. Then, we transform the resulting NFA 2This paper introduces a construction analogous to the beliefstate construction used in planning, see also [Raskin et al., 2007]. into a DFA Ad ', e.g., using the standard determinization algorithm based on the subset construction [Rabin and Scott, 1959]. This costs us another exponential. Then we build the belief-state DFA game GObs ' corresponding to Ad '. This costs us another exponential. At this point we solve the DFA game GObs ' by computing Win(GObs ' ) and the corresponding strategy generator TGObs ' . This is a linear step. Considering the cost of each of the steps above, we get the following worst-case computational complexity upper bound. Theorem 3. Synthesis under partial observability in LTLf/LDLf can be solved in 3EXPTIME. A 2EXPTIME lower-bound comes from the fact that LTLf/LDLf synthesis under full observability is 2EXPTIMEcomplete [De Giacomo and Vardi, 2015], but also by considering that conditional planning under partial observability, which is indeed 2EXPTIME-complete [Rintanen, 2004], is a special case. Theorem 4. Synthesis under partial observability in LTLf/LDLf is 2EXPTIME-hard. It turns out that we can close the above gap between membership and hardness, however we have to give up the beliefstate construction. This is what we do next. 5 Projection-based Construction We now devise a technique to solve synthesis under partial observability which avoids the belief-state construction. In fact, we study such a technique starting from a variety of specifications for the join admissible traces of environment and agent. Namely we consider specifications given as: DFA, which is a case of particular interest since condi- tional planning problems under partial observability can be seen as compactly (logarithmically) represented DFA specifications (notice these allow one to model devilish nondeterminism of nondeterministic planning domains); NFA, which, while possibly hard to imagine as an ac- tual specification for a planning/synthesis problem, is of interest from a technical point of view; AFW, which generalizes both a DFA and NFA and is tightly related to LTLf/LDLf synthesis; LTLf/LDLf formula, which is the problem of interest in this paper. The various procedures above, along with complexity results, are summarized in the table in Figure 1 (In the case of DFA we consider also compact representation which essentially corresponds to conditional planning under partial observability). From DFA specification. We consider first the case in which the specification is given directly as a DFA (over X and Y, not directly over Obs and Y). Let the DFA A be A = (2X 2Y, S, s0, δ, F), where: 2X 2Y is the alphabet of the DFA; S are the states of the DFA; s0 is the initial state of the DFA; δ : S 2X 2Y ! S is the transition function of the DFA: given the current state s and a choice of propositions X and Y , respectively for the enviroment and the agent, δ(s, (X, Y )) = s0 is the resulting state of the DFA; F are the final states of the DFA. Step 1: first complementation. Consider the complement A of A. Since A is a DFA we can obtain its complement A in linear time by simply switching in A the final states with the nonfinal ones: A = (2X 2Y, S, s0, δ, F), where F = S F. The DFA A accepts all joint traces for enviroment and agent that do not satisfy the specification A. Step 2: projection. We project out the propositions that are hidden, getting a new automaton (A) = (2Obs 2Y, S, s0, δ , F) where: δ (s, (O, Y )) = {s0 | δ(s, (X, Y )) = s0 and obs(X)=O} Notice that this automaton (A) accepts an observable trace ! if there exists a full trace 2 full(!) that is accepted by A, i.e., a full trace that violates the specification A. Notice also that (A) is an NFA. Step 3: second complementation. Consider the complement (A) of (A). This is an exponentially larger DFA, obtained through the usual subset construction [Hopcroft et al., 2001], which accepts a trace ! if all full traces in full(!) satisfy the specification A. The DFA (A) represents the specification for a synthesis problem with full observability whose solutions are strategies (depending only on the observable history) which are solutions for the original synthesis problem under partial observability. To solve it we consider (A) as a DFA game under full observability and compute its winning strategies (which require a polynomial fixpoint computation). Thus we get an EXPTIME procedure for synthesis under partial observability from DFA specification. Theorem 5. Synthesis under partial observability from a DFA specification is EXPTIME-complete. Proof (sketch). Membership is obtained by using the procedure above. Hardness follows from EXPTIME-hardness of DFA games under partial observability, see Theorem 2. Note that the technique above, although it avoids the beliefstate construction is analogous to it at Step 3. Indeed, the transition function generated at Step 3 is the result of the subset construction for determinization, which is based on the power set of the original states as the belief-state construction is. It is of particular interest to consider the case where the DFA specification is represented compactly (i.e., logarithmically), as for example in conditional planning problems under partial observability, where the planning domain and the reachability goal are represented compactly, e.g., as logical formulas [Rintanen, 2004]. Then the technique above becomes 2EXPTIME. Theorem 6. Synthesis under partial observability from a DFA specification given in a compact (logarithmic) representation is 2EXPTIME-complete. DFA spec NFA spec AFW spec LTLf/LDLf spec 1. Complement DFA spec A, getting DFA A (poly) 2. Project out unobservable propositions from A, getting NFA (A) (poly) 3. Complement NFA (A), getting DFA (A) (exp) 4. Solve DFA game (poly) 1. Complement NFA spec A, getting DFA A (exp) 2. Project out unobservable propositions from A, getting NFA (A) (poly) 3. Complement NFA (A), getting DFA (A) (exp) 4. Solve DFA game (poly) 0. Complement AFW spec Aafw, getting AFW Aafw (poly) 1. Transform AFW Aafw into NFA, getting NFA A (exp) 2. Project out unobservable propositions from A, getting NFA (A) (poly) 3. Complement NFA (A), getting DFA (A) (exp) 4. Solve DFA game (poly) 0. Compute AFW A Φ for Φ (poly) 1. Transform AFW A Φ into NFA, getting NFA A (exp) 2. Project out unobservable propositions from A, getting NFA (A) (poly) 3. Complement NFA (A), getting DFA (A) (exp) 4. Solve DFA game (poly) EXPTIME-complete (2EXPTIME-complete starting from compact representations) 2EXPTIME 2EXPTIME-complete 2EXPTIME-complete Figure 1: Summary of techniques and results for synthesis under partial observability Proof (sketch). Membership is obtained by using the procedure above. Hardness comes from 2EXPTIME-completeness of conditional planning under partial observability. From NFA specification. If we start from a specification given as an NFA, the procedure above still applies, though in this case the first complementation requires an exponential blow up, thus making the procedure 2EXPTIME. It remains open whether the problem is indeed 2EXPTIME-hard. Indeed, we conjecture it is. Theorem 7. Synthesis under partial observability from an NFA specification is in 2EXPTIME. From AFW specification. If we start from a specification given as AFW then the first complementation can be done in polynomial time, but then we need to transform the resulting AFW into an NFA for projection, and this costs an exponential. So performing the rest of the steps above we get a 2EXPTIME procedure, as for the case of NFA specification. Theorem 8. Synthesis under partial observability from an AFW specification is 2EXPTIME-complete. Proof (sketch). Membership is obtained by using the procedure above. Hardness follows from that for LTLf/LDLf specifications (Theorem 9), which can indeed be translated polynomially into AFW. From LTLf/LDLf specification. Finally, if we start from specifications Φ given as an LTLf/LDLf formula (or as a finite conjunctions of formulas), then complementation trivially reduces to negating the formula, getting Φ, and the translation of the formula into an NFA can be done in exponential time. Then, as before we proceed by projection, complementation and solving the resulting DFA game. The procedure in this case is 2EXPTIME in the size of the specification formula Φ. This is the same complexity of conditional planning under partial information which is 2EXPTIME-complete already. Theorem 9. Synthesis under partial observability from LTLf/LDLf specification is 2EXPTIME-complete. Proof (sketch). Membership is obtained by using the procedure above. Hardness follows from 2EXPTIME-hardness of conditional planning under partial observability [Rintanen, 2004], which is a special case. We observe that it is quite odd and interesting that for the synthesis problem remains of the same complexity starting from such different forms of specification, cf. Figure 1, although for different reasons in each case. 6 Conclusion We have studied LTLf/LDLf synthesis under partial observability, showing that the problem is 2EXPTIME-complete, as for the cases of LTLf/LDLf synthesis under full observability [De Giacomo and Vardi, 2015] and conditional planning under partial observability [Rintanen, 2004]. Unlike in [Rintanen, 2004], the classical belief-state construction, while applicable, does not give us a procedure that is optimal wrt worst-case computational complexity. This non-optimality is due to the need of, on one hand, removing angelic nondeterminism from the specification, leaving only the adversarial nondeterminism, and on other hand, dealing with the lack of knowledge coming from partial observability. The beliefstate construction handles these two aspects separately, requiring an exponential blow up for each of them. This, combined with the fact that logic allows for compact representations, gives us a 3EXPTIME procedure in the case of synthesis from general LTLf/LDLf specification. The belief-state construction gets back 2EXPTIME only when the LTLf/LDLf specification does not give rise to angelic nondeterminism, as in the notable case of conditional planning (under partial observability). The new technique based on projection and complementation proposed here is able to combine the elimination of angelic nondeterminism with the elimination of partial observation, thus getting to 2EXPTIME. Interestingly also in the case of infinite traces LTL synthesis under partial observability is 2EXPTIME-complete [Vardi, 1995; Kupferman and Vardi, 1997]. Nevertheless, the techniques available in this case, are not easily implementable, since they involve determinization of !-automata that still resists good algorithms [Fogarty et al., 2013]. The construction proposed here, instead, although as expensive in the worstcase, includes only steps for which good algorithms are available, so effective tools can indeed be developed. One final observation is that in the end we reduce our problem to solving DFA games, which are based on reachability. This is an area where the research in planning excels [Geffner and Bonet, 2013]. It would be interesting to understand to what extent planning technologies can actually be pushed to solve efficiently these forms of synthesis, see [Torres and Baier, 2015] for some recent results. Acknowledgements. This research was partially supported by the Sapienza project Immersive Cognitive Environments , by NSF grants CCF-1319459 and IIS-1527668, by NSF Expeditions in Computing project Ex CAPE: Expeditions in Computer Augmented Program Engineering , and by BSF grant 9800096. Part of this work was done while the second author was visiting the Israeli Institute for Advanced Studies. References [Albore et al., 2009] Alexandre Albore, H ector Palacios, and Hector Geffner. A translation-based approach to contingent planning. In Proc. of IJCAI, 2009. [Bacchus and Kabanza, 2000] Fahiem Bacchus and Froduald Kabanza. Using temporal logics to express search control knowledge for planning. Artificial Intelligence, 116(1-2), 2000. [Bertoli et al., 2006] Piergiorgio Bertoli, Alessandro Cimatti, Marco Roveri, and Paolo Traverso. Strong planning under partial observability. Artificial Intelligence, 170(45):337 384, 2006. [Bonet and Geffner, 2000] Blai Bonet and Hector Geffner. Planning with incomplete information as heuristic search in belief space. In Proc. of AIPS, pages 52 61, 2000. [Bryce et al., 2006] Daniel Bryce, Subbarao Kambhampati, and David E. Smith. Planning graph heuristics for belief space search. J. Artif. Intell. Res. (JAIR), 26, 2006. [Church, 1963] Alonzo Church. Logic, arithmetics, and au- tomata. In Proc. International Congress of Mathematicians, 1962. institut Mittag-Leffler, 1963. [De Giacomo and Vardi, 2013] Giuseppe De Giacomo and Moshe Y. Vardi. Linear temporal logic and linear dynamic logic on finite traces. In Proc. of IJCAI, 2013. [De Giacomo and Vardi, 2015] Giuseppe De Giacomo and Moshe Y. Vardi. Synthesis for LTL and LDL on finite traces. In Proc. of IJCAI, 2015. [De Giacomo et al., 2014] Giuseppe De Giacomo, Riccardo De Masellis, Marco Grasso, Fabrizio Maria Maggi, and Marco Montali. Monitoring business metaconstraints based on LTL and LDL for finite traces. In Proc. of BPM, 2014. [Fischer and Ladner, 1979] Michael J. Fischer and Richard E. Ladner. Propositional dynamic logic of regular programs. Journal of Computer and System Sciences, 18, 1979. [Fogarty et al., 2013] Seth Fogarty, Orna Kupferman, Moshe Y. Vardi, and Thomas Wilke. Profile trees for B uchi word automata, with application to determinization. In Proc. of Gand ALF, 2013. [Geffner and Bonet, 2013] Hector Geffner and Blai Bonet. A Concise Introduction to Models and Methods for Automated Planning. Morgan & Claypool Publishers, 2013. [Gerevini et al., 2009] Alfonso Gerevini, Patrik Haslum, Derek Long, Alessandro Saetti, and Yannis Dimopoulos. Deterministic planning in the fifth international planning competition: PDDL3 and experimental evaluation of the planners. Artificial Intelligence, 173(5-6), 2009. [Goldman and Boddy, 1996] Robert P. Goldman and Mark S. Boddy. Expressive planning and explicit knowledge. In Proc. of AIPS, 1996. [Harel et al., 2000] David Harel, Dexter Kozen, and Jerzy Tiuryn. Dynamic Logic. MIT Press, 2000. [Hoffmann and Brafman, 2005] J org Hoffmann and Ronen I. Brafman. Contingent planning via heuristic forward search witn implicit belief states. In Proc. of ICAPS, 2005. [Hopcroft et al., 2001] John E Hopcroft, Rajeev Motwani, and Jeffrey D Ullman. Introduction to Automata Theory, Languages, and Computation (2nd Edition). Addison Wesley, 2 edition, July 2001. [Kupferman and Vardi, 1997] Orna Kupferman and Moshe Vardi. Synthesis with Incomplete Informatio. Proc. ICTL, 1997. [Maliah et al., 2014] Shlomi Maliah, Ronen I. Brafman, Erez Karpas, and Guy Shani. Partially observable online contingent planning using landmark heuristics. In Proc. of ICAPS, 2014. [Pesic and van der Aalst, 2006] Maja Pesic and Wil M. P. van der Aalst. A declarative approach for flexible business processes management. In Proc. of the BPM 2006 Workshops, volume 4103 of LNCS. Springer, 2006. [Pnueli and Rosner, 1989] Amir Pnueli and Roni Rosner. On the synthesis of a reactive module. In Proc. of POPL, 1989. [Pnueli, 1977] Amir Pnueli. The temporal logic of programs. In Proc. of FOCS, 1977. [Rabin and Scott, 1959] Michael O. Rabin and Dana Scott. Fi- nite automata and their decision problems. IBM J. Res. Dev., 3(2), April 1959. [Raskin et al., 2007] Jean-Franc ois Raskin, Krishnendu Chat- terjee, Laurent Doyen, and Thomas A. Henzinger. Algorithms for omega-regular games with imperfect information. Logical Methods in Computer Science, 3(3), 2007. [Reif, 1984] John H. Reif. The complexity of two-player games of incomplete information. Journal of Computer and System Sciences, 29(2):274 301, 1984. [Rintanen, 2004] Jussi Rintanen. Complexity of planning with partial observability. In Proc. of ICAPS, 2004. [Sun et al., 2012] Yutian Sun, Wei Xu, and Jianwen Su. Declarative choreographies for artifacts. In Proc. of ICSOC, 2012. [Torres and Baier, 2015] Jorge Torres and Jorge A. Baier. Polynomial-time reformulations of LTL temporally extended goals into final-state goals. In Proc. IJCAI, 2015. [Vardi, 1995] Moshe Y. Vardi. An automata-theoretic approach to fair realizability and synthesis. In Proc. of CAV, 1995. [Vardi, 1996] Moshe Y. Vardi. An automata-theoretic approach to linear temporal logic. In Logics for Concurrency: Structure versus Automata, volume 1043 of LNCS. Springer, 1996. [Vardi, 2011] Moshe Y. Vardi. The rise and fall of linear time logic. In Proc. of Gand ALF, 2011.