# programming_by_example_using_least_general_generalizations__c353c184.pdf Programming by Example using Least General Generalizations Mohammad Raza Microsoft Research Cambridge 21 Station Road, Cambridge, CB1 2FB, U.K. a-moraza@microsoft.com Sumit Gulwani Microsoft Research Redmond One Microsoft Way, Redmond, WA 98052-6399, U.S.A. sumitg@microsoft.com Natasa Milic-Frayling Microsoft Research Cambridge 21 Station Road, Cambridge, CB1 2FB, U.K. natasamf@microsoft.com Recent advances in Programming by Example (PBE) have supported new applications to text editing, but existing approaches are limited to simple text strings. In this paper we address transformations in richly formatted documents, using an approach based on the idea of least general generalizations from inductive inference, which avoids the scalability issues faced by stateof-the-art PBE methods. We describe a novel domain specific language (DSL) that expresses transformations over XML structures describing richly formatted content, and a synthesis algorithm that generates a minimal program with respect to a natural subsumption ordering in our DSL. We present experimental results on tasks collected from online help forums, showing an average of 4.17 examples required for task completion. Introduction The area of Programming by Example, or PBE (Lieberman 2001; Gulwani 2012), has recently been gaining renewed interest, especially in the domain of text editing (Gulwani 2011; Manshadi, Gildea, and Allen 2013; Liang, Jordan, and Klein 2010; Lau et al. 2003) and has also seen successful adoption in commercial end user applications such as the recent Flash Fill feature in Microsoft Excel (Gulwani 2011). These advances have, however, been limited to relatively simple editing scenarios on small unstructured text strings and cannot, for example, address structural transformations in richly formatted documents, e.g., word processing or presentation documents created by standard office productivity tools. Performing repetitive formatting operations in such documents is a common and demanding task, particularly for long documents or across large document collections, as we observed from many user requests in online help forums. Existing features such as styles or templates allow some degree of abstraction, but are limited in functionality and suffer from discoverability issues. On the other hand, macro programming languages are very powerful but above the skillset of most end users. We propose a PBE interaction that would naturally be suited to such tasks, allowing the user to provide some examples of the desired transformations using the Copyright c 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. standard WYSIWYG editing interface, from which the system can infer a general program to apply to whole documents. An important challenge faced by PBE approaches in general is the delicate balance that must be achieved between the expressivity of the domain specific language (DSL) of transformations and the efficiency of the synthesis algorithm, because the more expressive the DSL, the more complex is the search space of possible programs, and the harder it is to effectively maintain and search within this space. For instance, recent state-of-the-art approaches use DAG based data structures to handle sophistication of the underlying transformation language (Gulwani 2011; Singh and Gulwani 2012b; Manshadi, Gildea, and Allen 2013; Singh and Gulwani 2012a). However, they only scale to relatively small unstructured strings (for example, Flash Fill only permits transformations on strings up to 256 characters) because of their complexity that is exponential in the number of examples and high degree polynomial in the size of each example. Hence, such approaches, even if they could express the underlying transformations, would not scale to our setting of transformations between large XML structures that describe richly formatted documents (most modern office suites work with XML-based file formats1). In this paper we propose a PBE approach to structural format transformations which, in contrast to above-mentioned methods, avoids explicit generation of all consistent programs by incorporating the ranking strategy implicitly into the synthesis algorithm. After all, the end goal in any PBE task is to find a single satisfying program rather than the set of all possible satisfying programs. Our approach is based on the notion of least general generalization from Plotkin s work on inductive inference (Plotkin 1970; 1971), which laid the foundations for the field of inductive logic programming. In this work the θsubsumption relation was proposed as a tractable alternative to implication when inferring generalizations over first order formulae, and the syntactic anti-unification algorithm was described for inferring a least generalization with respect to subsumption. In our case, we are not working with logic formulae; our first contribution is a DSL that expresses transformations on ordered trees (including variable, function, and 1including Open Office, Microsoft Office, Apple i Work Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence loop expressions) and we show how programs in this DSL are naturally ordered by a similar subsumption relation. Our second contribution is a program synthesis algorithm which, given a set of input-output examples, generates a program in our DSL that satisfies all the examples. Using techniques similar to syntactic anti-unification, the algorithm generates a minimal program with respect to the subsumption ordering, without constructing a represention of all possible satisfying programs, which can be exponential in the size of examples. Motivation In a repetitive formatting task, the users usually have some intended selection criteria to specify items to change, which, in general, may be independent from the actual transformation they want to apply to these items, as we have observed in help forums. The selection criteria may, for example, be all text boxes containing text with Arial font of size 12, while the desired alteration may be change every such text box to a table with a particular configuration. Hence, the programs in our DSL must accurately characterise both the selection criteria for the applicable inputs as well as the desired outputs to ensure that only items relevant to the task are affected. We consider an example of the above text box to table transformation task in Figure 1, where we use a simplified XML format to illustrate the state before and after the transformation in Figures 1(a) and 1(b) (attributes at nodes are shown inside boxes). Initially the text box contains three paragraphs (with text P1 , P2 and P3 ), and satisfies the requirement of Arial font of size 12. In the output this is changed to a table with a certain height, border, font type and color, but the same text content and bold settings as the input. In this simplified format we only show some properties for illustration although, in practice, specifications contain many more elements, attributes, and substructures. In Figures 1(c) and 1(d) we illustrate a program in our DSL that satisfies this example, specifying both the selection criteria and the transformation intended for this task. A program consists of two expressions describing the form of the applicable inputs and the outputs. The input expression in Figure 1(c) describes a tree that has a root element text Box with a certain number of child subtrees that each has a root element para, with font Arial and size 12, and any text value, color or bold setting (expressed by variables X1, X2, X3). The output expression in Figure 1(d) specifies a tree with root table and the same number of child subtrees as the input (expressed by iteration variable I1). Each subtree has root table Row and two children specifying row properties and a single cell. The text and bold values in the cell are the same as in the input (variables X1 and X3) while color, size, and font are constant. While this program specifies precisely the selection criterion and the transformation that is intended for this task, there are many other possible programs in the DSL that can satisfy the example transformation in Figures 1(a) and 1(b). For example, instead of a loop the program may assert exactly 3 paragraphs in the input or, instead of a single cell in each row, there could be a loop expression to allow for multiple cells. Similarly for attribute values, the input expression may have specified variables over all the attributes, or specify the value blue for color rather than the variable X2. Using all the examples provided for a given task, our program synthesis algorithm makes such choices by inferring a minimal generalization over the input and the output states and by mapping variables between the input and output states to generate a well-formed program satisfying all the examples. Domain Specific Language In this section we define the model of XML specifications that we use in this paper and the syntax and semantics of the domain specific language (DSL) used to express transformations on these structures. We model XML specifications as ordered trees in which every node has an associated element and a map from attributes to values. Definition 1 (Tree model) Given fixed sets of elements Elm, attributes Att and literal values Lit, we define trees t T by the following grammar: t := 0 | r | t t r := (e, m)[t] where 0 is the empty tree, r is a rooted tree and t t is a parallel composition of trees as ordered siblings. A rooted tree r is of the form (e, m)[t] with root node (e, m) and subtree t. A node (e, m) consists of an element e Elm and a partial map from attributes to literal values m : Att Lit. We impose the structural congruences that parallel composition is associative, i.e., (t1 t2) t3 = t1 (t2 t3), and has identity 0, i.e., t 0 = 0 t = t. We next define the DSL for transforming XML trees which is agnostic of any particular XML schemas and hence generically applicable. The syntax of the DSL is given in Figure 2, which we first discuss before giving the formal semantics in the next section. A program Prog(τ1, τ2) consists of a pair of tree expressions τ1, τ2, where τ1 describes the applicable input trees to the program and τ2 describes the output tree. For example, Figures 1(c) and 1(d) illustrate the input and output tree expressions for the text to table program. A tree expression is a parallel composition of atomic expressions, where an atomic expression is either a rooted tree expression or a loop expression. A rooted tree expression (e, φ)[τ] describes a rooted tree that has root element e, attribute map described by φ and subtree described by τ. An attribute map expression φ may use literals, variables or function expressions to express the values that a particular attribute may take. For example, in Figure 1(d), there is a rooted subtree with root element table Cell at which the text and bold attributes have variable values while color , font and size attributes have literal values. Function expressions may be used to describe transformations of attribute values from input to output. For example, in Figure 1(d) we could have text = f(X1) to indicate that the text value at that node changes according to some text transformation function f (e.g., syntactic transformations such as name or date format changes (Gulwani 2011)). On numeric-valued attributes we may have, for example, size = X in the input and size = X + 3 in the out- text= P3 color= blue font= Arial size= 12 bold= 1 text= P2 color= blue font= Arial size= 12 bold= 1 text= P1 color= blue font= Arial size= 12 bold= 0 (a) Example input state text= P3 color= red font= Times size= 12 bold= 1 border= 1 height= 2 text= P2 color= red font= Times size= 12 bold= 1 border= 1 height= 2 text= P1 color= red font= Times size= 12 bold= 0 border= 1 height= 2 (b) Example output state Loop(I1, para) text= X1 color= X2 font= Arial size= 12 bold= X3 (c) Program input Loop(I1, table Row) text= X1 color= red font= Times size= 12 bold= X3 border= 1 height= 2 (d) Program output Figure 1: Example input and output states and satisfying program for the text box to table transformation task put. In general, we assume a fixed set of functions between literal values that are available in our DSL. Loop expressions allow for structural variations in the trees by expressing an arbitrary number of sibling trees in a parallel composition. An expression Loop(I, ρ) describes any number of sibling trees, each described by rooted tree expression ρ. The loop iterator I is used to relate loop expressions between input and output tree expressions. For example, in Figure 1 there are loop expressions in the program input and output that are related by the iterator I1. When applying the program to an input such as in Figure 1(a), the iterator I1 is used to determine the number of loop iterations when instantiating the output (3 in this case) as well as the valuations of the variables for each iteration (X1, X2 and X3). Note that the language does not permit nested loop expressions at the same level of parallel composition (e.g. (I1, (I2, ρ)) but loop expressions can be nested inside subtrees (e.g. (I1, ρ) where ρ contains I2). For example, if every paragraph in Figure 1(a) were fragmented into a number of runs and each run should appear in a separate column in the output table, that may be expressed by a program in our DSL by using a nested loop expression to represent varying number of table cells in each table row. We define Var(τ), Iter(τ) and FExp(τ) for the set of variables, iterators, and function expressions that appear in tree expression τ respectively. These sets are similarly defined for programs, atomic and rooted tree expressions, and attribute map expressions. We define the set of root elements in τ as Root(τ) = {e} τ = (e, φ)[τ ] or τ = Loop(I, (e, φ)[τ ]) [ i 1..n Root(αi) τ = α1 . . . αn The well-formedness constraint ( ) in Figure 2 ensures unambiguous matching of expressions with concrete trees, as well as tractable inference in the presence of loops. Note that every concrete tree is a valid tree expression, since any t T is an expression τ with Var(τ) = Iter(τ) = FExp(τ) = . Semantics When a program Prog(τ1, τ2) is applied to a concrete tree t, a valuation of variables and iterators is inferred for the input Program P := Prog(τ1, τ2) Tree Exp τ := 0 | α | τ τ with constraint ( ) Atomic Tree Exp α := ρ | Loop(I, ρ) where I Iter Rooted Tree Exp ρ := (e, φ)[τ] where e Elm Att Map Exp φ : Att Val Val := Lit Var FExp FExp := f(X) where f : Lit Lit X Var Var := X, X1, X2, . . . Iter := I, I1, I2, . . . ( ) if τ = α1 . . . αn then τ is well-formed if j