# ltl_realizability_via_safety_and_reachability_games__af6c2cee.pdf LTL Realizability via Safety and Reachability Games Alberto Camacho , Christian Muise , Jorge A. Baier , Sheila A. Mc Ilraith Department of Computer Science, University of Toronto IBM Research, Cambridge Research Center, USA Pontificia Universidad Cat olica de Chile Chilean Center for Semantic Web Research {acamacho,sheila}@cs.toronto.edu, christian.muise@ibm.com jabaier@ing.puc.cl In this paper, we address the problem of LTL realizability and synthesis. State of the art techniques rely on so-called bounded synthesis methods, which reduce the problem to a safety game. Realizability is determined by solving synthesis in a dual game. We provide a unified view of duality, and introduce novel bounded realizability methods via reductions to reachability games. Further, we introduce algorithms, based on AI automated planning, to solve these safety and reachability games. This is the the first complete approach to LTL realizability and synthesis via automated planning. Experiments illustrate that reductions to reachability games are an alternative to reductions to safety games, and show that planning can be a competitive approach to LTL realizability and synthesis. 1 Introduction LTL synthesis aims to compute a strategy that satisfies a Linear Temporal Logic (LTL) specification. The problem was first proposed in the context of reactive synthesis, and is central to the automated construction of controllers and certain classes of programs [Pnueli and Rosner, 1989]. LTL realizability and synthesis are commonly addressed as a two-player game between an agent and the environment, played over automata transformations of the LTL specification (so-called automata games). Each player has a disjoint set of variables, and the objective is to synthesize a strategy for setting the agent s ( controllable ) variables such that the LTL specification is guaranteed to be satisfied, no matter how the environment sets its ( uncontrollable ) variables. Traditional approaches to LTL realizability and synthesis assume that the agent plays first. Interestingly, in recent years there has been a surge in the development of modern LTL synthesis tools based on automata games, and whereas LTL synthesis tool Lily [Jobstmann and Bloem, 2006] adopts an agent-first play protocol, many of the most successful This work was conducted while the author was affiliated with CSAIL, Massachusetts Institute of Technology. LTL synthesis tools such as Unbeast [Ehlers, 2010], Acacia, and Acacia+ [Filiot et al., 2009; Bohy et al., 2012] adopt an inverted turn protocol, where the environment plays first. Indeed, this inverted turn-taking has become the standard used in different incarnations of SYNTCOMP, the annual reactive synthesis competition (e.g. [Jacobs et al., 2017]). SYNTCOMP has promoted standardization and facilitated tool comparison with the introduction of the high-level LTL input language TLSF [Jacobs et al., 2016]. Participating tools (including Acacia+) are compliant with TLSF. To describe the different semantics associated with turntaking protocols, the notions of Mealyand Moore-type semantics has been introduced (e.g. [Ehlers, 2011; Khalimov et al., 2013]), where intuitively Mealy-type semantics correspond to settings where the environment plays first, and Moore-type semantics correspond to settings where the agent plays first. The names originate from Mealy and Moore machines. When a winning strategy exists for one of these semantics, the strategy has a finite representation and is typically conveyed in terms of Mealy (resp. Moore) machines. Inverting the order of agent-environment turn-taking corresponds to playing a dual game. We aim to provide a unified view of duality results for LTL realizability and synthesis, and automata games. To this end, we review the connection between LTL synthesis and automata games, and formalize a duality result for existence of winning strategies in automata games that replicates the duality results for synthesis. We further investigate different reductions to automata games that exploit duality to determine realizability, and introduce novel techniques that exploit these correspondences. We exploit mappings from LTL synthesis to games, together with duality results, to chronicle different techniques to solve LTL realizability and synthesis (Table 1). These techniques extend well-known approaches, and establish the connection between Mealy and Moore semantics for LTL realizability and synthesis, and games over Universal k-co B uchi Word (Uk CW) automata and Non-deterministic k-B uchi Word (Nk BW) automata. From an algorithmic perspective, our approaches to LTL realizability via reduction to Nk BW games provide an alternative to existing techniques based on Uk CW. Whereas the latter are safety games, the former are reachability games. Finally, we address the realizability problem in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18) safe and co-safe specifications, and show how these can be solved more efficiently. We further present the first complete approach to LTL synthesis via Fully Observable Non-Deterministic (FOND) planning. Our approach constructs a planning problem that simulates an automata game, and that can be solved using state-ofthe-art planners. While our contributions are intended to be largely theoretical, we conducted preliminary experiments to assess the practicality of the different techniques for LTL realizability and synthesis that we present. Preliminary results show that techniques employing Nk BW automata can complement existing techniques based on Uk CW automata, addressing realizability and synthesis problems that otherwise cannot be solved by existing tools. Moreover, they show that automated planning technology (and, in particular, FOND) can be an efficient approach to LTL realizability and synthesis. Finally, we believe that the Nk BW game reductions presented here open the door to novel bounded realizability methods for LTL realizability and synthesis as reachability games, providing an alternative to bounded synthesis [Kupferman and Vardi, 2005; Schewe and Finkbeiner, 2007]. 2 Background 2.1 Automata on Infinite Words A non-deterministic automaton is a tuple A = Q, Σ, q0, δ, α , where Q is a finite set of automaton states, Σ is a finite alphabet of input letters, q0 Q is the initial state of the automaton, δ Q Σ Q is a transition relation, and α Q is an accepting condition. We sometimes write δ(q, σ) to denote the set Q Q such that (q, σ, q ) δ for every q Q . Intuitively, an automaton A in state q can transition into any state in δ(q, σ) when it reads an input letter σ. When |δ(q, σ)| = 1 for every q Q and σ Σ, the automaton is deterministic. Σω denotes the set of all infinite words over Σ. A run of A on word w = σ1, σ2, . . . Σω is an infinite sequence of states ρ = q0, q1, . . . Qω where (qi 1, σi, qi) δ for every i > 0. It is useful to define operators occρ : Q N {0, }, indexed by runs ρ, that return for each q Q the number of occurrences of q in ρ. In what follows, we review three types of acceptance conditions. A run ρ is accepting with the B uchi condition if occρ(q) = for some q α that is, when some state in α occurs infinitely often in ρ. Conversely, a run ρ is accepting with the co B uchi condition if occρ(q) < for every q α. Finally, ρ is accepting with the k-B uchi (resp. k-co B uchi) condition for k N {0} if occρ(q) k for some q α (resp. occρ(q) k for every q α). The language of an automaton A, L(A), is the set of infinite words accepted by A. We say a non-deterministic automaton A accepts a word w when some run of A on w is accepting that is, A has non-deterministic branching factor. In contrast, when A is a universal automaton all runs of A on w must be accepting that is, A has universal branching factor. Following the naming conventions described above, Nondeterministic B uchi Word (NBW) automata and Nondeterministic k-B uchi Word (Nk BW) automata have nondeterministic branching factor and B uchi and k-B uchi acceptance condition, respectively. Similarly, Universal Co B uchi Word (UCW) automata and Universal k-co B uchi Word (Uk CW) automata have universal branching factor and co B uchi and k-co B uchi acceptance condition, respectively. For clarity, throughout the paper we use a pairwise notation to denote the acceptance condition of automata. For example, Uk = (A, Uk CW) and Nk = (A, Nk BW) denote automaton A with, respectively, Uk CW and Nk BW acceptance conditions. This notation makes it easy to write automata that differ in their accepting conditions. In particular, it facilitates notation in complementation rules (cf. Proposition 1). The complementation of an automaton A is the task of constructing an automaton that accepts the words that are not in L(A). Proposition 1. The complementation of Uk is Nk+1. 2.2 Linear Temporal Logic Linear Temporal Logic (LTL) extends propositional logic with the unary temporal operator next and the binary temporal operator until . As such, ϕ stands for next ϕ and ϕUψ stands for ϕ until ψ . LTL formulae over a set of propositions AP are evaluated over infinite sequences s0s1 , where each si is a subset of AP. If σ = s0s1 , then, for every i 0: σ, i |= ϕ if ϕ is a propositional formula and si |= ϕ. σ, i |= ϕ iffσ, i + 1 |= ϕ. σ, i |= ϕUψ iffthere exists a j i such that sj |= ψ and for every k {i, . . . , j 1}, sk |= ϕ. Finally we say that σ is a model of ϕ, denoted σ |= ϕ, iff σ, 0 |= ϕ. Three other common temporal operators, eventually ϕ ( ϕ), always ϕ ( ϕ), and ψ releases ϕ (ψRϕ), can be defined as follows: ϕ Uϕ, ϕ ϕ, and ψRϕ ( ψU ϕ). 2.3 LTL and Automata It is well-known that for an LTL formula ϕ one can construct an automaton Nϕ = (Aϕ, NBW) that accepts all and only the infinite words that satisfy ϕ. The construction is worst-case exponential in the size of the formula. An UCW automaton Uϕ that accepts the models of ϕ can be obtained from the construction of an NBW automaton that accepts the models of ϕ. Formally, Uϕ = (A, UCW) accepts the models of ϕ iff N ϕ = (A, NBW) accepts the models of ϕ (cf. [Kupferman and Vardi, 2005]). Lemma 1 summarizes these results. Lemma 1. Given an LTL formula ϕ, it is possible to build an NBW automaton Nϕ, and a UCW automaton Uϕ, such that L(Nϕ) = L(Uϕ) = {w Σω | w |= ϕ}. The constructions are worst-case exponential in the size of ϕ. 2.4 Fully Observable Non-Deterministic Planning Here we briefly describe the Fully Observable Non Deterministic (FOND) setting, and refer the interested reader to Geffner and Bonet [2013] for a more thorough description. A FOND problem is a tuple F, O, I,G , where F is a set of fluents (atomic propositions whose truth value can change over time); I F are the fluents true in the initial state; G F, are the fluents that must hold in a goal state; and O is the set of (possibly non-deterministic) operators or action schemas. Each o O is made up of Preo (the fluents that must hold for o to be executable) and Effo (the set of possible effects for o, one of which will update the state). Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18) States are represented by a set of fluents (all those not in the set are presumed false) and partial states are represented by a set of literals (i.e., fluents or their negation). An action is a ground operator. The outcome of action a in state s is another state s that results from updating s with one of the effects of the ground operator o corresponding to a. A distinguishing property of FOND planning is that the action outcome is not known until run time, and so all contingencies must be planned for. For this reason, solutions to a FOND problem take the form of a policy π that maps the state of the world to the action that should be executed. An execution of a policy π from state s0 is a sequence of states s0, s1, . . . such that each si+1 is an outcome of π(si) in si. Cimatti et al. [2003] defined different classes of solutions to a FOND problem. Here, we examine two of them. A strong solution is a policy that is guaranteed to achieve the goal regardless of non-determinism. Strong cyclic solutions are less restrictive and guarantee goal reachability in the presence of fairness, which presumes that all the action outcomes in a given state would manifest infinitely often. As the name suggests, a strong cyclic solution may revisit states. Strong cyclic solutions have the property stated in Proposition 2. Proposition 2 (Strong Cyclic Solution). A policy π is a strong cyclic solution ifffor every state s reachable by π there exists an execution that achieves the goal from s. Automated planning technology is highly optimized, and typically used for computation of plans or policies that achieve a prescribed goal after execution of a finite number of actions that can be unbounded in strong cyclic plans. Patrizi et al. [2013] proposed a means of synthesizing infinite plans for planning domains with LTL goal formulae and fair non-deterministic actions. Their technique compiles the original domain into a new domain in a way that executions of a strong cyclic solution to the modified domain produce infinite plans in the original domain. The approach is limited to the class of LTL formulae that can be compiled into deterministic B uchi automata. 3 LTL Realizability and Synthesis The realizability and synthesis problems for an LTL specification were first posed by Pnueli and Rosner in 1989. Since then, the scholarly literature has explored two slightly different interpretations of an LTL specification in the context of synthesis. In this paper, we study both interpretations, which differ in their semantics, distinguishing them via the subscript s Mealy, Moore notation (cf. Definition 1). We adopt the terms Mealy and Moore semantics, which have been used previously in the synthesis literature (e.g. [Ehlers, 2011; Khalimov et al., 2013]). A strategy for an LTL specification X, Y, ϕ s is a function f : (2X) 2Y that maps histories, or finite sequences of subsets of X, into subsets of Y. A strategy f is winning when, for every infinite sequence X1X2 of subsets of X, either: (i) s = Mealy and {(Xi f(X1 Xi))}i 1 satisfies ϕ; or (ii) s = Moore and {(Xi f(X0 Xi 1))}i 1 satisfies ϕ for some constant X0, typically set to the empty trace ϵ. Winning strategies for an LTL specification X, Y, ϕ Moore are also winning strategies for X, Y, ϕ Mealy, but the opposite does not hold in general (see example below). The realizability problem consists of determining existence of a winning strategy, and the synthesis problem consists of computing one (Definition 2). For unrealizable specifications, a certificate of unrealizablity is a strategy g : (2Y) 2X that prevents the agent from realizing ϕ. Lemma 2 formulates a well-known duality between the two semantics in LTL realizability. Definition 1. An LTL specification is a tuple X, Y, ϕ s, where X and Y are two finite and disjoint sets of variables, ϕ is an LTL formula over variables in X Y, and s {Mealy, Moore}. Definition 2. The realizability problem for an LTL specification X, Y, ϕ s is to determine whether there exists a winning strategy. The synthesis problem for a realizable LTL specification consists of computing a winning strategy. We say that an LTL specification is unrealizable when it is not realizable. Lemma 2 (Duality). An LTL specification X, Y, ϕ Mealy is realizable iff Y, X, ϕ Moore is unrealizable. Example LTL specification {x} , {y} , (x y) Mealy is realizable, and has a winning strategy f that maps each sequence X1 Xn (2X) to {y} if Xn = {x}, and to otherwise. On the other hand, {x} , {y} , (x y) Moore is unrealizable. An unrealizability certificate g : (2Y) 2X can be constructed by assigning g({y}) = and g( ) = {x}. Clearly, plays that start with a prefix (g(f(X0)) f(X0)) cannot satisfy (x y). 3.1 Automata Games Following Thomas [1995], in this paper a two-player game is defined with respect to two finite sets of variables, X and Y, and an automaton A with alphabet Σ = 2X Y whose language describes which plays are winning in the game. In this paper we exploit the correspondence between LTL synthesis and games in various forms. To facilitate readability, in this paper we opt to represent games with a tuple X, Y, A s, where s Moore, Mealy . In the following, we formalize automata games, and present duality results that are analogous to those for LTL synthesis. We first describe the dynamics of a game X, Y, A Moore, which corresponds to the definition of automata games commonly used in the literature. A game X, Y, A Moore has two players, P1 and P2. In each round of the game, P1 selects a subset of variables Yi Y, followed by P2, which selects a subset of variables Xi X. The game is played an infinite number of rounds, and so a play consists of an infinite word w = {(Xi Yi) | Xi X, Yi Y}i 1. The play is winning (for P1) if A accepts w, and the game is winning (for P1) if P1 has a strategy that only yields winning plays, regardless of the moves of P2. Formally, X, Y, A Moore is winning if there exists an strategy f : (2X) 2Y such that, for any sequence {Xi}i 1 of moves of P2, A accepts the infinite word w = {(Xi f(X0 Xi 1))}i 1 for some constant X0. In this paper we are also interested in examining the conditions under which player P2 has a winning strategy. We write X, Y, A Mealy to denote a game with inverted turns with respect to X, Y, A Moore. In each round, player P1 selects a subset of variables of X, followed by player P2, which selects a subset of variables of Y. The difference now is Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18) that the game is P2-centric. We say that the game is winning (for P2) if there exists an strategy g : (2X) 2Y such that, for any sequence {Xi}i 1 of moves of P1, the play w = {(Xi f(X1 Xi))}i 1 is accepted by A. Proposition 3 below is analogous to Lemma 2, and formalizes the well-known zero-sum property of automata games. That is, player P1 has a strategy to win a game X, Y, A Moore iffplayer P2 has no strategy to win the dual game Y, X, A Mealy, where A is an automaton that accepts the complement language of A. Note that duality inverts the set of controllable variables. Throughout this paper we exploit a particular case of Proposition 3 with two classes of automata that recognize complement languages: Uk CW and Nk BW automata. We formalize the result in Lemma 3. Proposition 3 (Duality). A game X, Y, A Moore is winning (for P1) iffthe game Y, X, A Mealy is not winning (for P2). Lemma 3 (Duality). Let Uk = (A, Uk CW) and Nk = (A, Nk BW). A game X, Y, Uk Moore is winning iff Y, X, Nk+1 Mealy is not winning. Likewise, X, Y, Uk Mealy is winning iff Y, X, Nk+1 Moore is not winning. 3.2 LTL Realizability and Automata Games The correspondence between LTL realizability and automata games has been commonly exploited to approach LTL realizability and synthesis (e.g. [Jobstmann and Bloem, 2006; Filiot et al., 2009; Ehlers, 2010]). Theorem 1 formalizes the mapping, which preserves winning strategies. Typically, automata-based approaches to LTL synthesis rely on deterministic automata. NBW automata can be determinized (into Rabin automata) with the Safra construction. This construction, unfortunately, is very involved and not amenable to symbolic methods. Reportedly, the best tools cannot even handle relatively simple automata with more than 6 states. Theorem 1. Let ϕ be an LTL formula over X Y, and let Aϕ be an automaton that accepts the models of ϕ. σ is a winning strategy for X, Y, ϕ Moore (resp. X, Y, ϕ Mealy) iffσ is a winning strategy for X, Y, Aϕ Moore (resp. X, Y, Aϕ Mealy). Safety Games As an alternative to avoid Safra s construction, so-called safraless methods reduce the synthesis problem to a series of safety games over Uk CW automata [Kupferman and Vardi, 2005]. In a safety game, all plays that violate the LTL specification formula ϕ have a bad prefix. A bad prefix is a finite trace π from which it is no longer possible to satisfy ϕ. That is, any infinite trace π with prefix π is such that π |= ϕ. The challenge in a safety game is for the agent to avoid bad prefixes. The advantage of using Uk CW automata is that these can be determinized efficiently, using well-known methods derived from the powerset construction commonly used to determinize finite-state automata. Reachability Games In a reachability game, winning plays have a good prefix, that is, a finite trace π such that any infinite trace π with prefix π satisfies the specification. In particular, games over Nk BW automata are reachability games. 4 Reductions to Automata Games In this section we introduce a collection of techniques to reduce LTL realizability into a series of games defined over Uk CW and Nk BW automata. The reductions rely on Theorems 2, 3, 4, and 5, and are summarized in Table 1. Theorem 2. Let A = (A, UCW) be an automaton that accepts the models of ϕ. The following are equivalent: X, Y, ϕ Mealy is unrealizable Y, X, (A, Uk CW) Moore is winning for some k in 2O(|A|) X, Y, (A, Nk BW) Mealy not winning for some k in 2O(|A|) Proof sketch. By Lemma 2, X, Y, ϕ Mealy is unrealizable iff Y, X, ϕ Moore is realizable. By well-known results on bounded synthesis (e.g. [Kupferman, 2006]), this occurs iff Y, X, (A, Uk CW) Moore is winning for some k in 2O(|A|). By Lemma 3, this occurs iff X, Y, (A, Nk BW) Mealy is not winning for some k worst-case exponential in the size of A. Theorem 3. Let A = (A, NBW) be an automaton that accepts the models of ϕ. The following is equivalent: X, Y, ϕ Mealy is realizable Y, X, (A, Nk BW) Moore is not winning for some k in 2O(|A|) X, Y, (A, Uk CW) Mealy is winning for some k in 2O(|A|) Proof sketch. By Lemma 2, X, Y, ϕ Mealy is realizable iff Y, X, ϕ Moore is unrealizable. This occurs iff Y, X, (A, Nk BW) Moore is not winning for some k < . By Lemma 3, this occurs iff X, Y, (A, Uk CW) Mealy is winning for some k < . The results for bounded synthesis (e.g. [Kupferman, 2006]) set the exponential bound on k. Theorem 4. Let A = (A, UCW) be an automaton that accepts the models of ϕ The following is equivalent: X, Y, ϕ Moore is realizable X, Y, (A, Uk CW) Moore is winning for some k in 2O(|A|) Y, X, (A, Nk BW) Mealy is not winning for some k in 2O(|A|) Proof sketch. It follows from negating Theorem 2 and applying the duality results in Lemmas 2 and 3. Theorem 5. Let A = (A, NBW) be an automaton that accepts the models of ϕ. The following is equivalent: X, Y, ϕ Moore is unrealizable X, Y, (A, Nk BW) Moore is not winning for some k in 2O(|A|) Y, X, (A, Uk CW) Mealy is winning for some k in 2O(|A|) Proof sketch. It follows frin negating Theorem 3 and applying the duality results in Lemmas 2 and 3. Bounded Realizability and Synthesis Table 1 summarizes the results of the theorems above into a series of tests to determine realizability and unrealizability of an LTL specification. The tests based on reductions to Uk CW games (i.e. safety games) are well known, and commonly referred to as bounded synthesis methods [Schewe and Finkbeiner, 2007], which build upon safraless methods [Kupferman and Vardi, 2005]. These tests are constructive, meaning that a winning strategy for the reduced game is a winning strategy for the original specification in case it is realizable , or a certificate of unrealizability in case it is unrealizable. Bounded synthesis has been a successful approach to synthesis, and nowadays many modern tools exploit reductions to Uk CW games in some way (e.g. Lily [Jobstmann and Bloem, 2006], Unbeast [Ehlers, 2010], and Acacia/Acacia+ [Filiot et al., 2009; Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18) Specification Construction of A Test Good for Constructive Thm X, Y, ϕ Mealy L(A, NBW) = L(ϕ) k. Y, X, (A, Uk CW) Moore winning for P1 unrealizable co-safe yes 2 X, Y, ϕ Mealy L(A, NBW) = L(ϕ) k. X, Y, (A, Nk BW) Mealy not winning for P2 unrealizable co-safe no 2 X, Y, ϕ Mealy L(A, UCW) = L(ϕ) k. Y, X, (A, Nk BW) Moore not winning for P1 realizable safe no 3 X, Y, ϕ Mealy L(A, UCW) = L(ϕ) k. X, Y, (A, Uk CW) Mealy winning for P2 realizable safe yes 3 X, Y, ϕ Moore L(A, UCW) = L(ϕ) k. X, Y, (A, Uk CW) Moore winning for P1 realizable safe yes 4 X, Y, ϕ Moore L(A, UCW) = L(ϕ) k. Y, X, (A, Nk BW) Mealy not winning for P2 realizable safe no 4 X, Y, ϕ Moore L(A, NBW) = L(ϕ) k. X, Y, (A, Nk BW) Moore not winning for P1 unrealizable co-safe no 5 X, Y, ϕ Moore L(A, NBW) = L(ϕ) k. Y, X, (A, Uk CW) Mealy winning for P2 unrealizable co-safe yes 5 Table 1: Strategies to determine realizability and unrealizability of an LTL specification via reduction to automata games. When ϕ is a safe (resp. co-safe) LTL formula, strategies that are good for safe (resp. good for co-safe) formulae only require tests with k = 0 in Uk CW games; and k = 1 in Nk BW games. Constructive tests yield a winning strategy or certificate, as appropriate. Bohy et al., 2012]). The tests based on reductions to Nk BW games (i.e. reachability games) are novel. These tests determine realizability and unrealizability. By way of analogy to bounded synthesis, we refer to them as bounded realizability methods. Note that a complete approach for LTL realizability and synthesis would combine at least two of these approaches in parallel or with interleaved search. Safe and Co-Safe LTL Specifications Different syntactic characterizations of LTL formulas have been studied (e.g. [Sistla, 1994]). Here, we examine the safe and co-safe fragment of LTL [Kupferman and Vardi, 2001]. Safe properties assert that something bad never happens. Co-safe properties are dual, and assert that something good eventually happens. The following syntactic fragments of LTL express safe, and co-safe properties: Safe: ϕ = | | p | p | ϕ ϕ | ϕ | ϕRϕ Co-safe: ϕ = | | p | p | ϕ ϕ | ϕ | ϕUϕ Safe and co-safe LTL formulae can be transformed into automata with absorbing rejecting and accepting states, respectively [Kupferman and Vardi, 2001]. We observe that with these automata constructions, realizability of safe and co-safe formulae can be determined by performing only one test (Theorem 6). Theorem 6. LTL realizability for safe (resp. co-safe) specifications can be determined with strategies that are good for safe (resp. good for co-safe) formulae (cf. Table 1) by performing a single test with k = 0, in the case of reductions to Uk CW games, and k = 1, in the case of reductions to Nk BW games. Proof sketch. The proof follows from applying Lemma 3 in Theorems 2, 3, 4, and 5, with a construction of an NBW with absorbing accepting states. Here, we sketch the proof of one case. Let ϕ be a co-safe LTL formula, and let A = (A, NBW) be an automaton that accepts the models of ϕ. X, Y, ϕ Moore is realizable iff X, Y, (A, Nk BW) Moore is winning for all k, but this happens iff X, Y, (A, Nk BW) Moore is winning for k = 1 (because A has absorbing accepting states). Lemma 3 derives that this only happens iff Y, X, (A, Uk CW) Mealy is not winning for k = 0. 5 Automata Games via Planning In recent years there has been increased interest in the development of algorithms that exploit automated planning tech- niques to synthesize programs of varying sorts including so-called generalized planning (e.g., [Aguas et al., 2016; Bonet et al., 2017]) and LTL synthesis. Camacho et al. [2018b] studied the correspondence between LTL synthesis and planning, and presented an algorithm to synthesize strategies, via planning, that is complete when the formula is transformed into deterministic B uchi automata, but not in general. Together with this work, it constitutes the first approach to LTL synthesis via planning. Also related to this work is [Camacho et al., 2018a], where planning technology is used to solve LTL synthesis for LTL interpreted over finite words [De Giacomo and Vardi, 2015]. Approaching LTL synthesis as planning enables the use of highly optimized search techniques and heuristics. Planning algorithms can reason about action costs, stochasticity, plan quality, and preferences. Our work opens the door to using planning technology for richer notions of LTL synthesis, including those that optimize for high-quality strategies [Almagor and Kupferman, 2016]. In this section we present reductions of Uk CW and Nk BW games to FOND planning. These reductions comprise the first complete approach to LTL realizability and synthesis via planning. For an LTL specification X, Y, ϕ , and a test strategy from Table 1, our approach consists of three steps: (1) construct an automaton A (2) solve a sequence of automata games via FOND planning (3) determine realizability or unrealizability In addition, and if the test is constructive, a fourth step can be performed to synthesize a winning strategy if realizable or a certificate of unrealizability if unrealizable. Step 1: LTL to automata In the first step we construct an automaton (A, NBW) that accepts the models of ϕ or ϕ, according to the strategy selected from Table 1. Our FOND compilation requires post-processing of the set of automaton transitions. Each transition t = (q, guard(t), q ) in A with guard guard(t) = W i ci in DNF is decomposed into a set of transitions ti = (q, ci, q ). Each ti has guard ci, that is a conjunctive formula over variables in X Y. For each ti, we denote by macro(ti) the set of transitions derived from t. It can be proved that the decomposition of the transitions does not affect the language accepted by the automaton. Step 2: Construction of a FOND Problem The second step takes as input the automaton generated in Step 1, and constructs a series of FOND problems Pk, for k 0. Each Pk Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18) simulates a game over a Uk CW or Nk BW automaton in accordance with the strategy selected from Table 1. We detail the FOND compilations for Uk CW and Nk BW automata games in Sections 5.1 and 5.2. Step 3: Determine Realizability or Unrealizability The compiled FOND problems Pk have the following property: Pk has a solution iffthe simulated automata game is winning. This property is exploited in the following manner: our technique searches for existence of solutions to the FOND problems Pk for k = 0, 1, . . . until the test in Table 1 is passed. Note that the tests are not guaranteed to terminate if, for instance, LTL realizability for an unrealizable specification is checked. However, termination can be achieved by combining two strategies in parallel: one that checks realizability, and another that checks unrealizability. Step 4: Construction of a strategy A finite state controller that implements a strategy that solves the synthesis p roblem for specification ϕ can be constructed from a policy for Pk. See [Patrizi et al., 2013; Camacho et al., 2017] for details. 5.1 Uk CW Games via Planning Figure 1 describes the details of the compilation of a game with automaton A = Q, Σ, q0, δ, α into a FOND planning problem Pk. For illustration purposes, we assume the game is in the form X, Y, (A, Uk CW) Mealy, that is, we are searching for winning strategies for P2. At the end of the section we show how the dynamics of Pk can be modified to compile games in the form X, Y, (A, Uk CW) Moore. The dynamics of Pk simulates an infinite sequence of rounds in the two-player game. Each round is divided into three sequential stages. The first stage simulates the move of the environment and agent players. The move of P1 is simulated with a cascade of actions that non-deterministically assign a value to the variables in X. In turn, the moves of P2 are simulated with a cascade of deterministic actions that assign a value to the variables in Y. As a consequence of the player moves, some automaton transitions are deemed unfeasible. The second stage performs all automaton transitions. Fluents F(q) are true in a planning state if there exists a run of A on the play being simulated that ends in q. The third stage sets the value of the counters associated to each automaton state q. A fluent new Cnt(q, m) is true if there exists a run of A on the simulated play that ends in q and hits m accepting states. The fluent old Cnt(q, m) keeps track of the maximum number of hits for a fluent q. At the end of the third stage, the first stage is reestablished to simulate the next round. There are two important things to notice about the dynamics of Pk. First, when an input word being simulated is not accepted by (A, Uk CW) (that is, a run visits more than k accepting states), a deadend in Pk is incurred. This is because a fluent new Cnt(q, k + 1) prevents sync F actions to disable the fluent F(q), action continue is no longer applicable, and the first stage can no longer be reestablished. Second, the non-determinism of action continue ensures that execution of strong-cyclic solutions yield an infinite number of rounds in the game being simulated (cf. [Patrizi et al., 2013]). Finally, the turn-taking of player moves mandated by the semantics of the specification (Moore vs. Mealy) is forced by Components of the Safe2FOND compilation: F B {turn(vi)}0 i |X Y| {sync, goal} {poss(t)}t T {old Cnt(q, m), new Cnt(q, m)}q Q,0 m k I B {sync, old Cnt(q0, 0)} {safe(t)}t Safe(T) {poss(t)}t.orig=q0 Safe(T) B {(q, σ, q ) T | q < α} O B {move X(v)}v X {move Pos Y(v), move Neg Y(v)}v Y {trans Acc(t, m), trans Rej(t, m)}t T,0 m k {start Sync, continue} {sync F(q, m)}q Q,0 m k G B {goal} Stage 1: Actions that simulate X and Y moves: Premove X(vi) = {turn(vi)} Effmove X(vi) = {turn(vi+1), turn(vi)} oneof(e1, e2) Premove Pos Y(vi) = {turn(vi)} Effmove Pos Y(vi) = {turn(vi+1), turn(vi)} e1 Premove Neg Y(vi) = {turn(vi)} Effmove Neg Y(vi) = {turn(vi+1), turn(vi)} e2 e1 = { poss(t)}t T, vi Lits(guard(t)) e2 = { poss(t)}t T,vi Lits(guard(t)) Stage2: Actions that simulate transitions, t T, 0 m k: Pretrans Rej(t,m) = turn(v|X Y|), poss(t), safe(t), old Cnt(q, m) Efftrans Rej(t,m) = {F(q ), new Cnt(q , m)} { poss(t )}t macro(t) Pretrans Acc(t,m) = turn(v|X Y|), poss(t), safe(t), old Cnt(q, m) Efftrans Acc(t,m) = {F(q ), new Cnt(q , m + 1)} { poss(t )}t macro(t) Stage 3: Actions that synchronize the automaton, q Q, 0 m k: Prestart Sync = turn(v|X Y|) { poss(t)}t T Effstart Sync = sync, turn(v|X Y|) { old Cnt(q, m)}q Q,0 m k Presync F(q,m) = {sync, F(q), new Cnt(q, m)} { new Cnt(q, n)}m