Learning Loop Invariants for Program Verification

Xujie Si, University of Pennsylvania, xsi@cis.upenn.edu; Hanjun Dai, Georgia Tech, hanjundai@gatech.edu; Mukund Raghothaman, University of Pennsylvania, rmukund@cis.upenn.edu; Mayur Naik, University of Pennsylvania, mhnaik@cis.upenn.edu; Le Song, Georgia Tech and Ant Financial, lsong@cc.gatech.edu. Both authors contributed equally to the paper.

Abstract

A fundamental problem in program verification concerns inferring loop invariants. The problem is undecidable and even practical instances are challenging. Inspired by how human experts construct loop invariants, we propose a reasoning framework, CODE2INV, that constructs the solution by multi-step decision making and querying an external program graph memory block. By training with reinforcement learning, CODE2INV captures rich program features and avoids the need for ground truth solutions as supervision. Compared to previous learning tasks in domains with graph-structured data, it addresses unique challenges, such as a binary objective function and an extremely sparse reward that is given by an automated theorem prover only after the complete loop invariant is proposed. We evaluate CODE2INV on a suite of 133 benchmark problems and compare it to three state-of-the-art systems. It solves 106 problems compared to 73 by a stochastic search-based system, 77 by a heuristic search-based system, and 100 by a decision tree learning-based system. Moreover, the learned strategy can be generalized to new programs: compared to solving new instances from scratch, the pre-trained agent is more sample efficient in finding solutions.

1 Introduction

The growing ubiquity and complexity of software has led to a dramatic increase in software bugs and security vulnerabilities that pose enormous costs and risks. Program verification technology enables programmers to prove the absence of such problems at compile time, before deploying their program. One of the main activities underlying this technology involves inferring a loop invariant, a logical formula that constitutes an abstract specification of a loop, for each loop in the program. Obtaining loop invariants enables a broad and deep range of correctness and security properties to be proven automatically by a variety of program verification tools spanning type checkers, static analyzers, and theorem provers. Notable examples include Microsoft Code Contracts for .NET programs [1] and the Verified Software Toolchain spanning C source code to machine language [2].

Many different approaches have been proposed in the literature to infer loop invariants. The problem is undecidable, however, and even practical instances are challenging, which greatly limits the benefits of program verification technology. Existing approaches suffer from key drawbacks: they are purely search-based, or they use hand-crafted features, or they are based on supervised learning. The performance of search-based approaches is greatly hindered by their inability to learn from past mistakes. Hand-crafted features limit the space of possible invariants; e.g., Garg et al. [3] is limited to features of the form $\pm x \pm y \le c$ where $c$ is a constant, and thus cannot handle invariants that involve $x + y \le z$ for program variables $x, y, z$. Finally, obtaining ground truth solutions needed by supervised learning is hindered by the undecidability of the loop invariant generation problem.

In this paper, we propose CODE2INV, an end-to-end learning-based approach to infer loop invariants.
CODE2INV has the ability to automatically learn rich latent representations of desirable invariants, and can avoid repeating similar mistakes. Furthermore, it leverages reinforcement learning to discover invariants via partial feedback from trial-and-error, without needing ground truth solutions for training.

The design of CODE2INV is inspired by the reasoning exercised by human experts. Given a program, a human expert first maps the program to a well-organized structural representation, and then composes the loop invariant step by step. Based on such reasoning, different parts of the representation get highlighted at each step. To mimic this procedure, we utilize a graph neural network (GNN) model to construct the structured external memory representation of the program. The multi-step decision making is implemented by an autoregressive model, which queries the external memory using an attention mechanism. The decision at each step is made by a syntax- and semantics-guided decoder which generates subparts of the loop invariant.

CODE2INV employs a reinforcement learning approach since it is computationally intensive to obtain ground truth solutions. Although reinforcement learning algorithms have shown remarkable success in domains like combinatorial optimization [4, 5] (see Section 6 for more discussion of related work), our setting differs in two crucial ways: first, it has a non-continuous objective function (i.e., a proposed loop invariant is either correct or not); and second, the positive reward is extremely sparse and is given by an automated theorem prover [6] only after the correct loop invariant is proposed. We therefore model the policy learning as a multi-step decision making process: it provides a fine-grained reward at each step of building the loop invariant, followed by continuous feedback in the last step based on counterexamples collected by the agent itself during trial-and-error learning.

We evaluate CODE2INV on a suite of 133 benchmark problems from recent works [3, 7, 8] and the 2017 SyGuS program synthesis competition [9]. We also compare it to three state-of-the-art systems: a stochastic search-based system C2I [10], a heuristic search-based system LOOPINVGEN [8], and a decision tree learning-based system ICE-DT [3]. CODE2INV solves 106 problems, versus 73 by C2I, 77 by LOOPINVGEN, and 100 by ICE-DT. Moreover, CODE2INV exhibits better learning, making orders-of-magnitude fewer calls to the theorem prover than these systems.

2 Background

We formally define the loop invariant inference and learning problems by introducing Hoare logic [11], which comprises a set of axioms and inference rules for proving program correctness assertions. Let $P$ and $Q$ denote predicates over program variables and let $S$ denote a program. We say that the Hoare triple $\{P\}\, S\, \{Q\}$ is valid if, whenever $S$ begins executing in a state that satisfies $P$ and finishes executing, the resulting state satisfies $Q$. We call $P$ and $Q$ the pre-condition and post-condition, respectively, of $S$. Hoare rules allow such triples to be derived inductively over the structure of $S$. The rule most relevant for our purpose is the one for loops:

$$\frac{P \Rightarrow I \;\;(\text{pre}) \qquad \{I \wedge B\}\; S\; \{I\} \;\;(\text{inv}) \qquad (I \wedge \neg B) \Rightarrow Q \;\;(\text{post})}{\{P\}\ \mathbf{while}\ B\ \mathbf{do}\ S\ \{Q\}}$$

Predicate $I$ is called a loop invariant, an assertion that holds before and after each iteration, as shown in the premise of the rule.
We can now formally state the loop invariant inference problem.

Problem 1 (Loop Invariant Inference): Given a pre-condition $P$, a post-condition $Q$, and a program $S$ containing a single loop, can we find a predicate $I$ such that $\{P\}\, S\, \{Q\}$ is valid?

Given a candidate loop invariant, it is straightforward for an automated theorem prover such as Z3 [6] to check whether the three conditions denoted pre, inv, and post in the premise of the above rule hold, and thereby prove the property asserted in the conclusion of the rule. If any of the three conditions fails to hold, the theorem prover returns a concrete counterexample witnessing the failure.

The loop invariant inference problem is undecidable. Moreover, even seemingly simple instances are challenging, as we illustrate next using the program in Figure 1(a). The goal is to prove that the assertion (y > 0) holds at the end of the program, for every input value of the integer variable y. In this case, the pre-condition $P$ is true, since the input value of y is unconstrained, and the post-condition $Q$ is (y > 0), the assertion to be proven. Using the predicate $(x < 0 \vee y > 0)$ as the loop invariant $I$ suffices to prove the assertion, as shown in Figure 1(b). Notation $\varphi[e/x]$ denotes the predicate $\varphi$ with each occurrence of variable $x$ replaced by expression $e$. This loop invariant is non-trivial to infer. The reasoning is simple in the case when the input value of y is non-negative, but far more subtle in the case when it is negative: regardless of how negative it is at the beginning, the loop will iterate at least as many times as needed to make it positive, thereby ensuring the desired assertion upon finishing. Indeed, a state-of-the-art loop invariant generator, LOOPINVGEN [8], crashes on this problem instance after making 1,119 calls to Z3, whereas CODE2INV successfully generates it after only 26 such calls.

(a) An example program:
    x := -50;
    while (x < 0) { x := x + y; y := y + 1 }
    assert(y > 0)
(b) A desirable loop invariant $I$ is a predicate over $x, y$ such that:
    $\text{true} \Rightarrow I[-50/x]$  (pre)
    $I \wedge x < 0 \Rightarrow I[(y+1)/y,\,(x+y)/x]$  (inv)
    $I \wedge x \ge 0 \Rightarrow y > 0$  (post)
(c) The desired loop invariant is $(x < 0 \vee y > 0)$.
Figure 1: A program with a correctness assertion and a loop invariant that suffices to prove it.

The central role played by loop invariants in program verification has led to a large body of work to automatically infer them. Many previous approaches are based on exhaustive bounded search using domain-specific heuristics and are thereby limited in applicability and scalability [7, 12-18]. A different strategy is followed by data-driven approaches proposed in recent years [3, 8, 10]. These methods speculatively guess likely invariants from program executions and check their validity. In [3], decision trees are used to learn loop invariants with simple linear features, e.g., predicates of the form $a\,x + b\,y \le c$.

 1 int main() {
 2   int x = 0, y = 0;
 3   while (*) {
 4     if (*) {
 5       x++;
 6       y = 100;
 7     } else {
 8       if (x >= 4) {
 9         x++;
10         y++;
11       }
12       if (x < 0) y--;
13     }
14   }
15   assert(x < 4 || y > 2);
16 }
Figure 2: An example from our benchmarks. * denotes nondeterministic choice.
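The check in Figure 1(b) can be made concrete with a few lines of Z3's Python bindings (z3py). The sketch below is our illustration of the three verification conditions, not CODE2INV's implementation; the variable names and helper function are ours.

```python
# Minimal sketch of the three checks in Figure 1(b) using Z3's Python bindings.
# Illustrative only; structure and names are ours, not CODE2INV's code.
from z3 import Ints, IntVal, BoolVal, And, Or, Not, Implies, Solver, substitute, unsat

x, y = Ints('x y')
I = Or(x < 0, y > 0)                                                  # candidate loop invariant

pre  = Implies(BoolVal(True), substitute(I, (x, IntVal(-50))))        # true => I[-50/x]
inv  = Implies(And(I, x < 0),
               substitute(I, (y, y + 1), (x, x + y)))                 # I & x<0 => I[(y+1)/y,(x+y)/x]
post = Implies(And(I, x >= 0), y > 0)                                 # I & x>=0 => y>0

def valid(formula):
    s = Solver()
    s.add(Not(formula))           # a formula is valid iff its negation is unsatisfiable
    return s.check() == unsat

print([valid(c) for c in (pre, inv, post)])                           # [True, True, True]
```

When one of these checks is not valid, the satisfying assignment of the negated formula is exactly the kind of concrete counterexample mentioned above, which the agent collects as feedback.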
We start out by illustrating how a human expert might typically accomplish the task of inferring a loop invariant. Consider the example in Figure 2, chosen from our benchmarks. An expert usually starts by reading the assertion (line 15), which contains variables x and y, then determines the locations where these two variables are initialized, and then focuses on the locations where they are updated in the loop. Instead of reasoning about the entire assertion at once, an expert is likely to focus on updates to one variable at a time.

This reasoning yields the observation that x is initialized to zero (line 2) and may get incremented in each iteration (lines 5 and 9). Thus, the subgoal x < 4 may not always hold, given that the loop iterates non-deterministically. This in turn forces the other part, y > 2, to be true when x >= 4. The only way x can equal or exceed 4 is to execute the first if branch 4 times (lines 4-6), during which y is set to 100. Now, a natural guess for the loop invariant is x < 4 || y >= 100. The reason for guessing y >= 100 instead of y <= 100 is that part of the proof goal is y > 2. However, this guess will be rejected by the theorem prover, because y might be decreased an arbitrary number of times in the third if-branch (line 12), which happens when x is less than zero; to avoid that situation, x >= 0 should also be part of the loop invariant. Finally, we have the correct loop invariant, (x >= 0) && (x < 4 || y >= 100), which suffices to prove the assertion.

We observe that the entire reasoning process consists of three key components: 1) organize the program in a hierarchically structured way rather than as a sequence of tokens; 2) compose the loop invariant step by step; and 3) focus on a different part of the program at each step, depending on the inference logic, e.g., abduction and induction.

3.2 Programming the reasoning procedure with neural networks

We propose to use a neural network to mimic the reasoning used by human experts as described above. The key idea is to replace the above three components with corresponding differentiable modules: a structured external memory representation which encodes the program; a multi-step autoregressive model for incremental loop invariant construction; and an attention component that mimics the varying focus in each step.

Figure 3: Overall framework of neuralizing loop invariant inference.

As shown in Figure 3, these modules together build up the network that constructs loop invariants from programs, while being jointly trained with the reinforcement learning described in Section 4. At each step, the neural network generates a predicate. Then, given the currently generated partial tree, a Tree LSTM module summarizes what has been generated so far, and the summarization is used to read the memory using attention. Lastly, the summarization together with the read memory is fed into the next time step. We next elaborate upon each of these three components.

3.2.1 Structured external memory

The loop invariant is built within the context of a given program. Thus it is natural to encode the program as an external memory module. However, in contrast to traditional memory networks [19, 20], where the memory slots are organized as a linear array, the information contained in a program has rich structure. A chain LSTM over program tokens can in principle capture such information, but it is challenging for neural networks to learn such structure from limited data. Inspired by Allamanis et al. [21], we instead use a graph-structured memory representation. Such a representation allows us to capture rich semantic knowledge about the program, such as its control flow and data flow.

More concretely, we first convert a given program into static single assignment (SSA) form [22] and construct a control flow graph, each of whose nodes represents a single program statement. We then transform each node into an abstract syntax tree (AST) representing the corresponding statement. Thus a program can be represented by a graph $G=(V,E)$, where $V$ contains the terminals and non-terminals of the ASTs, and $E = \{(e^{(i)}_x, e^{(i)}_y, e^{(i)}_t)\}_{i=1}^{|E|}$ is the set of edges. The directed edge $(e^{(i)}_x, e^{(i)}_y, e^{(i)}_t)$ goes from node $e^{(i)}_x$ to node $e^{(i)}_y$, with $e^{(i)}_t \in \{1,2,\ldots,K\}$ representing the edge type. In our construction, the program graph contains 3 different edge types (and 6 after adding reversed edges).
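As a toy, hypothetical illustration of this construction (the node and edge choices are ours; CODE2INV's exact graph may differ in detail), the single SSA statement x2 := x1 + y1 could give rise to the following typed edges, grouped into the neighborhood structure used by the embedding updates below.

```python
# A toy program graph for the single SSA statement  x2 := x1 + y1 .
# Illustrative only; CODE2INV's construction may differ in detail.
AST, CTRL, VAR = 0, 1, 2            # the 3 edge types (doubled to 6 once reversed edges are added)

nodes = ["stmt: x2 := x1 + y1",               # 0: control-flow-graph node for the statement
         ":=", "x2", "+", "x1", "y1",         # 1-5: AST non-terminals/terminals of the statement
         "stmt: successor"]                   # 6: the next statement in the control flow graph

edges = [(0, 1, AST), (1, 2, AST), (1, 3, AST), (3, 4, AST), (3, 5, AST),  # AST structure
         (0, 6, CTRL),                                                     # control flow
         (4, 2, VAR)]                # variable link: x1 and x2 are versions of the same variable

# N^k(v) = {u | (u, v, k) in E}: incoming neighbours of v under edge type k.
K = 3
N = [{v: [u for (u, w, t) in edges if w == v and t == k] for v in range(len(nodes))}
     for k in range(K)]
print(N[AST][3])                    # [1] : the '+' node receives an AST message from ':='
```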
Figure 4: Diagram of the source code graph as external structured memory. We convert a given program into a graph $G$, where nodes correspond to syntax elements, and edges indicate control flow, syntax tree structure, or variable linking. We use an embedding neural network to obtain the structured memory $f(G)$.

To convert the graph into a vector representation, we follow the general message passing operator introduced in graph neural networks (GNNs) [23] and their variants [21, 24, 25]. Specifically, the graph network associates each node $v \in V$ with an embedding vector $\mu_v \in \mathbb{R}^d$. The embedding is updated iteratively using the general neighborhood embedding as follows:

$$\mu^{(l+1)}_v = h\big(\{\mu^{(l)}_u\}_{u \in \mathcal{N}^k(v),\, k \in \{1,2,\ldots,K\}}\big) \quad (1)$$

Here $h(\cdot)$ is a nonlinear function that aggregates the neighborhood information to update the embedding, and $\mathcal{N}^k(v)$ is the set of neighbor nodes connected to $v$ with edge type $k$, i.e., $\mathcal{N}^k(v) = \{u \mid (u,v,k) \in E\}$. This process is repeated for $L$ steps, and the node embedding $\mu_v$ is set to $\mu^{(L)}_v$ for every $v \in V$. Our parameterization takes the edge types into account. The specific parameterization used is shown below:

$$\mu^{(l+1),k}_v = \sigma\Big(\textstyle\sum_{u \in \mathcal{N}^k(v)} W_2\, \mu^{(l)}_u\Big), \quad k \in \{1,2,\ldots,K\} \quad (2)$$
$$\mu^{(l+1)}_v = \sigma\big(W_3\,\big[\mu^{(l+1),1}_v, \mu^{(l+1),2}_v, \ldots, \mu^{(l+1),K}_v\big]\big) \quad (3)$$

with the boundary case $\mu^{(0)}_v = W_1 x_v$. Here $x_v$ represents the syntax information of node $v$, such as the token or constant value in the program. The matrices $W_1, W_2, W_3$ are learnable model parameters, and $\sigma$ is a nonlinear activation function. Figure 4 shows the construction of the graph-structured memory using the iterative message passing operator in Eq (1). $f(G) = \{\mu_v\}_{v \in V}$ denotes the structured memory.

3.2.2 Multi-step decision making process

A loop invariant itself is a mini-program that contains expressions and logical operations. Without loss of generality, we define the loop invariant to be a tree $T$, in the form of a conjunction of disjunctions:

$$T = (T_1 \,||\, T_2 \ldots) \;\&\&\; (T_{t+1} \,||\, T_{t+2} \ldots) \;\&\&\; \ldots \;(\ldots T_{T-1} \,||\, T_T) \quad (4)$$

Each subtree $T_t$ is a simple logic expression (e.g., x < y * 2 + 10 - z). Given this representation form, it is natural to use a Markov decision process (MDP) to model the problem, where the corresponding $T$-step finite-horizon MDP is defined as $M_G = (s_1, a_1, r_1, s_2, a_2, \ldots, s_T)$. Here $s_t, a_t, r_t$ represent the state, action, and reward at time step $t = 1, \ldots, T-1$, respectively. We describe the state and action used in the inference model here, and the design of reward and termination in Section 4.

action: As defined in Eq (4), a loop invariant tree $T$ consists of multiple subtrees $\{T_t\}$. Thus we model the action at time step $t$ as $a_t = (op_t, T_t)$, where $op_t$ can be either || or &&. That is to say, at each time step, the agent first decides whether to attach the subexpression $T_t$ to an existing disjunction or to create a new disjunction and add it to the list of conjunctions. We use $T^{(<t)}$ to denote the partial tree generated before time step $t$.

state: For $t > 1$, the action $a_t$ should be conditioned on the graph memory as well as the partial tree generated so far; thus $s_t = (G, T^{(<t)})$.
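To make the action space of Eq. (4) concrete, the following toy sketch (ours, not the paper's decoder, which generates syntax trees rather than strings) applies a sequence of actions $a_t = (op_t, T_t)$ and rebuilds the invariant from Section 3.1.

```python
# Toy rendering of Eq. (4): an invariant is a conjunction of disjunctions of
# simple predicates, grown one action a_t = (op_t, T_t) at a time.
# Illustrative sketch only; the real decoder works over ASTs, not strings.
class InvariantTree:
    def __init__(self):
        self.conjuncts = []                       # list of disjunctions (each a list of predicates)

    def apply(self, op, subtree):
        if op == "&&" or not self.conjuncts:
            self.conjuncts.append([subtree])      # start a new disjunction
        else:                                     # op == "||": extend the current disjunction
            self.conjuncts[-1].append(subtree)

    def __str__(self):
        return " && ".join("(" + " || ".join(d) + ")" for d in self.conjuncts)

T = InvariantTree()
for op, predicate in [("&&", "x >= 0"), ("&&", "x < 4"), ("||", "y >= 100")]:
    T.apply(op, predicate)
print(T)                                          # (x >= 0) && (x < 4 || y >= 100)
```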
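On the memory side, Eqs. (1)-(3) can be read as the following minimal NumPy sketch; the shapes, the ReLU activation, and the dense Python loops are our assumptions for illustration, not the paper's implementation.

```python
# Minimal NumPy sketch of the edge-type-aware message passing in Eqs. (1)-(3).
# Assumed shapes: x (n, f) node features, W1 (d, f), W2 (d, d), W3 (d, K*d).
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def embed(x, N, W1, W2, W3, L=3):
    """N[k][v] lists the neighbours of v under edge type k; returns f(G) = {mu_v} as (n, d)."""
    K = len(N)
    mu = x @ W1.T                                   # boundary case: mu_v^(0) = W1 x_v
    n, d = mu.shape
    for _ in range(L):                              # L rounds of message passing
        per_type = []
        for k in range(K):                          # Eq. (2): aggregate per edge type
            agg = np.zeros((n, d))
            for v in range(n):
                for u in N[k][v]:
                    agg[v] += mu[u] @ W2.T          # W2 mu_u^(l)
            per_type.append(relu(agg))
        mu = relu(np.concatenate(per_type, axis=1) @ W3.T)   # Eq. (3): combine the K channels
    return mu

# Tiny usage example with random parameters and a 2-node graph with one edge type.
rng = np.random.default_rng(0)
n, f, d, K = 2, 4, 8, 1
N = [{0: [1], 1: [0]}]
mu = embed(rng.normal(size=(n, f)), N, rng.normal(size=(d, f)),
           rng.normal(size=(d, d)), rng.normal(size=(d, K * d)))
print(mu.shape)                                     # (2, 8)
```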
Figure 6: (a) and (b) are verification costs of the pre-trained model and the untrained model; (c) and (d) are attention highlights for two example programs.

Enabling attention helps to reduce the verification cost modestly. CODE2INV achieves the best performance with both components enabled, the configuration used in other parts of our evaluation. Additionally, to test the effectiveness of the neural graph embedding, we study a simpler encoding, that is, viewing a program as a sequence of tokens and encoding the sequence using an LSTM. The performance of this setup is shown in the last row of Table 1. With a simple LSTM embedding, CODE2INV solves 13 fewer instances and, moreover, requires significantly more parameter updates.

Table 1: Ablation study for different configurations of CODE2INV.

configuration | #solved instances | max #Z3 queries | max #parameter updates
without CE, without attention | 91 | 415K | 441K
without CE, with attention | 94 | 147K | 162K
with CE, without attention | 95 | 392 | 337K
with CE, with attention | 106 | 276 | 290K
LSTM embedding + CE + attention | 93 | 32 | 661K

5.3 Boosting the search with pre-training

We next address the question: given an agent that is pre-trained on programs $P_{\text{train}} = \{p_i\} \subseteq \mathcal{P}$, can the agent solve new programs $P_{\text{test}} = \{\tilde{p}_i\} \subseteq \mathcal{P}$ faster than solving them from scratch? We prepare the training and testing data as follows. We take the programs solved by CODE2INV as the initial set and augment it by creating 100 variations of each of them, introducing confounding variables and statements in such a way that any valid loop invariant for the original program is still valid. Further details are provided in the appendix. Finally, 90% of these programs serve as $P_{\text{train}}$, and the rest are used as $P_{\text{test}}$. After pre-training the agent on $P_{\text{train}}$ for 50 epochs, we save the model and then reuse it for fine-tuning (or active search [4]) on $P_{\text{test}}$, i.e., the agent continues the trial-and-error reinforcement learning there.

Figure 6a and Figure 6b compare the verification costs of the pre-trained model and the untrained model on datasets augmented with 1 and 5 confounding variables, respectively. We observe that, on one hand, the pre-trained model has a clear advantage over the untrained model on either dataset; on the other hand, this gap narrows when more confounding variables are introduced. This result suggests an interesting future research direction: how to design a learning agent that can effectively identify the variables relevant to the loop invariant among a potentially large number of confounding variables.

5.4 Attention visualization

Figures 6c and 6d show the attention highlights for two example programs. The original highlights are provided on the program graph representation described in Section 3.2.1; we manually converted the graphs back to source code for clarity. Figure 6c shows an interesting example for which CODE2INV learns a strategy of showing that the assertion is actually not reachable, and thus holds trivially. Figure 6d shows another interesting example for which CODE2INV performs a form of abductive reasoning.

5.5 Discussion of limitations

We conclude our study with a discussion of limitations. For most of the instances that CODE2INV fails to solve, we observe that the loop invariant can be expressed in a compact disjunctive normal form (DNF) representation, which is more suited to the decision tree learning approach with hand-crafted features. However, CODE2INV is designed to produce loop invariants in conjunctive normal form (CNF). The reduction of loop invariants from DNF to CNF could incur an exponential blowup in size. An interesting future research direction concerns designing a learning agent that can flexibly switch between these two forms.
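To see the blowup concretely, the following small sketch (our illustration, not part of the paper) distributes a DNF with n two-literal disjuncts into an equivalent CNF with 2^n clauses.

```python
# Distributing OR over AND: a DNF with n disjuncts of 2 literals each becomes
# a CNF with 2**n clauses. Illustration only; not taken from the paper.
from itertools import product

def dnf_to_cnf(dnf):
    """dnf: list of conjunctive clauses (lists of literals). Returns the equivalent CNF,
    a list of disjunctive clauses obtained by picking one literal from each disjunct."""
    return [list(choice) for choice in product(*dnf)]

dnf = [["a1", "b1"], ["a2", "b2"], ["a3", "b3"]]     # (a1 & b1) | (a2 & b2) | (a3 & b3)
cnf = dnf_to_cnf(dnf)
print(len(cnf), cnf[0], cnf[-1])                     # 8 ['a1', 'a2', 'a3'] ['b1', 'b2', 'b3']
```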
6 Related Work

We survey work in program synthesis, program learning, learning loop invariants, and learning combinatorial optimizations.

Program synthesis. Automatically synthesizing a program from its specification has been a key challenge problem since Manna and Waldinger's work [32]. In this context, syntax-guided synthesis (SyGuS) [9] was proposed as a common format to express these problems. Besides several implementations of SyGuS solvers [9, 33-35], a number of probabilistic techniques have been proposed to model syntactic aspects of programs and to accelerate synthesis [36-38]. While logical program synthesis approaches guarantee semantic correctness, they are chiefly limited by their scalability and their requirement of rigorous specifications.

Program learning. There have been several attempts to learn general programs using neural networks. One large class of projects includes those attempting to use neural networks to accelerate the discovery of conventional programs [39-42]. Most existing works only consider specifications in the form of input-output examples, where weak supervision [43-45] or more fine-grained trace information is provided to help training. In our setting, there is no supervision for the ground truth loop invariant, and the agent needs to be able to compose a loop invariant purely from trial-and-error. Drawing inspiration from both programming languages and embedding methods, we build an efficient learning agent that can perform end-to-end reasoning in a way that mimics human experts.

Learning program loop invariants. Our work is closely related to recent work on learning loop invariants from either labeled ground truth [46] or active interactions with human experts [47]. Brockschmidt et al. [46] learn shape invariants for data structures (e.g., linked lists or trees). Their approach first extracts features using n-gram and reachability statistics over the program's heap graph and then applies supervised learning to train a neural network to map features to shape invariants. In contrast, we are concerned with general loop invariant generation, and our approach employs graph embedding directly on the program's AST and learns a generation policy without using ground truth as supervision. Bounov et al. [47] propose inferring loop invariants through gamification and crowdsourcing, which relieves the need for expertise in software verification but still requires significant human effort. In contrast, an automated theorem prover suffices for our approach.

Learning combinatorial optimizations. Our work is also related to recent advances in combinatorial optimization using machine learning [4, 5, 48, 49]. However, as elaborated in Section 4, the problem we study is significantly more difficult, in the sense that the objective function is non-smooth (a binary objective), and the positive reward is extremely sparse due to the exponentially growing size of the search space with respect to program size.

7 Conclusion

We studied the problem of learning loop invariants for program verification. Our proposed end-to-end reasoning framework learns to compose the solution automatically without any supervision. It solves a comparable number of benchmarks as the state-of-the-art solvers while requiring far fewer queries to a theorem prover.
Moreover, after being pre-trained, it can generalize the strategy to new instances much faster than starting from scratch. In the future, we plan to extend the framework to discover loop invariants for larger programs which present more confounding variables, as well as to discover other kinds of program correctness properties, such as ranking functions for proving program termination [50] and separation predicates for proving correctness of pointer-manipulating programs [51].

Acknowledgments. We thank the anonymous reviewers for insightful comments. We thank Ningning Xie for useful feedback. This research was supported in part by DARPA FA8750-15-2-0009, NSF (CCF-1526270, IIS-1350983, IIS-1639792, CNS-1704701) and ONR N00014-15-1-2340.

[1] Manuel Fahndrich and Francesco Logozzo. Static contract checking with abstract interpretation. In Proceedings of the 2010 International Conference on Formal Verification of Object-Oriented Software, 2010.
[2] Andrew W. Appel. Verified Software Toolchain. In Proceedings of the 20th European Symposium on Programming (ESOP), 2011.
[3] Pranav Garg, Daniel Neider, P. Madhusudan, and Dan Roth. Learning invariants using decision trees and implication counterexamples. In Proceedings of the ACM Symposium on Principles of Programming Languages (POPL), 2016.
[4] Irwan Bello, Hieu Pham, Quoc V. Le, Mohammad Norouzi, and Samy Bengio. Neural combinatorial optimization with reinforcement learning. CoRR, abs/1611.09940, 2016.
[5] Elias B. Khalil, Hanjun Dai, Yuyu Zhang, Bistra Dilkina, and Le Song. Learning combinatorial optimization algorithms over graphs. In Proceedings of the Conference on Neural Information Processing Systems (NIPS), 2017.
[6] Leonardo de Moura and Nikolaj Bjørner. Z3: An efficient SMT solver. In Proceedings of the 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 2008.
[7] Isil Dillig, Thomas Dillig, Boyang Li, and Ken McMillan. Inductive invariant generation via abductive inference. In Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA), 2013.
[8] Saswat Padhi, Rahul Sharma, and Todd Millstein. Data-driven precondition inference with learned features. In Proceedings of the ACM Conference on Programming Language Design and Implementation (PLDI), 2016.
[9] Rajeev Alur, Rastislav Bodik, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. Syntax-guided synthesis. In Proceedings of Formal Methods in Computer-Aided Design (FMCAD), 2013.
[10] Rahul Sharma and Alex Aiken. From invariant checking to invariant inference using randomized search. In Proceedings of the International Conference on Computer Aided Verification (CAV), 2014.
[11] C. A. R. Hoare. An axiomatic basis for computer programming. Communications of the ACM, 12(10), October 1969.
[12] Michael A. Colón, Sriram Sankaranarayanan, and Henny B. Sipma. Linear invariant generation using nonlinear constraint solving. In Proceedings of the International Conference on Computer Aided Verification (CAV), 2003.
[13] Sriram Sankaranarayanan, Henny B. Sipma, and Zohar Manna. Non-linear loop invariant generation using Gröbner bases. In Proceedings of the ACM Symposium on Principles of Programming Languages (POPL), 2004.
[14] Sumit Gulwani and Nebojsa Jojic. Program verification as probabilistic inference. In Proceedings of the ACM Symposium on Principles of Programming Languages (POPL), 2007.
[15] Rahul Sharma, Saurabh Gupta, Bharath Hariharan, Alex Aiken, Percy Liang, and Aditya V. Nori. A data driven approach for algebraic loop invariants. In Proceedings of the European Symposium on Programming (ESOP), 2013.
[16] Rahul Sharma, Isil Dillig, Thomas Dillig, and Alex Aiken. Simplifying loop invariant generation using splitter predicates. In Proceedings of the International Conference on Computer Aided Verification (CAV), 2011.
[17] Aws Albarghouthi, Sumit Gulwani, and Zachary Kincaid. Recursive program synthesis. In Proceedings of the International Conference on Computer Aided Verification (CAV), 2013.
[18] Pranav Garg, Christof Löding, P. Madhusudan, and Daniel Neider. ICE: A robust framework for learning invariants. In Proceedings of the International Conference on Computer Aided Verification (CAV), 2014.
[19] Sainbayar Sukhbaatar, Jason Weston, Rob Fergus, et al. End-to-end memory networks. In Proceedings of the Conference on Neural Information Processing Systems (NIPS), 2015.
[20] Alexander Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, and Jason Weston. Key-value memory networks for directly reading documents. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016.
[21] Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. Learning to represent programs with graphs. In Proceedings of the International Conference on Learning Representations (ICLR), 2018.
[22] Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems, 13(4), 1991.
[23] Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. The graph neural network model. IEEE Transactions on Neural Networks, 20(1):61-80, 2009.
[24] David K. Duvenaud, Dougal Maclaurin, Jorge Iparraguirre, Rafael Bombarell, Timothy Hirzel, Alán Aspuru-Guzik, and Ryan P. Adams. Convolutional networks on graphs for learning molecular fingerprints. In Proceedings of the Conference on Neural Information Processing Systems (NIPS), 2015.
[25] Hanjun Dai, Bo Dai, and Le Song. Discriminative embeddings of latent variable models for structured data. In Proceedings of the International Conference on Machine Learning (ICML), 2016.
[26] Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, and Pushmeet Kohli. Neuro-symbolic program synthesis. arXiv preprint arXiv:1611.01855, 2016.
[27] Matt J. Kusner, Brooks Paige, and José Miguel Hernández-Lobato. Grammar variational autoencoder. In Proceedings of the International Conference on Machine Learning (ICML), 2017.
[28] Hanjun Dai, Yingtao Tian, Bo Dai, Steven Skiena, and Le Song. Syntax-directed variational autoencoder for structured data. In Proceedings of the International Conference on Learning Representations (ICLR), 2018.
[29] Kai Sheng Tai, Richard Socher, and Christopher D. Manning. Improved semantic representations from tree-structured long short-term memory networks. In Proceedings of the Association for Computational Linguistics (ACL), 2015.
[30] David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al. Mastering the game of Go without human knowledge. Nature, 550(7676):354-359, 2017.
[31] SyGuS Competition, 2017. http://sygus.seas.upenn.edu/SyGuS-COMP2017.html.
[32] Zohar Manna and Richard J. Waldinger.
Toward automatic program synthesis. Communications of the ACM, 1971.
[33] Sumit Gulwani, Susmit Jha, Ashish Tiwari, and Ramarathnam Venkatesan. Synthesis of loop-free programs. In Proceedings of the ACM Conference on Programming Language Design and Implementation (PLDI), 2011.
[34] Rajeev Alur, Arjun Radhakrishna, and Abhishek Udupa. Scaling enumerative program synthesis via divide and conquer. In Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 2017.
[35] Eric Schkufza, Rahul Sharma, and Alex Aiken. Stochastic superoptimization. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2013.
[36] Pavol Bielik, Veselin Raychev, and Martin Vechev. PHOG: Probabilistic model for code. In Proceedings of the International Conference on Machine Learning (ICML), 2016.
[37] C. J. Maddison and D. Tarlow. Structured generative models of natural source code. In Proceedings of the International Conference on Machine Learning (ICML), 2014.
[38] Anh Tuan Nguyen and Tien N. Nguyen. Graph-based statistical language model for code. In Proceedings of the International Conference on Software Engineering (ICSE), 2015.
[39] M. Balog, A. L. Gaunt, M. Brockschmidt, S. Nowozin, and D. Tarlow. DeepCoder: Learning to write programs. In Proceedings of the International Conference on Learning Representations (ICLR), 2017.
[40] Vijayaraghavan Murali, Letao Qi, Swarat Chaudhuri, and Chris Jermaine. Neural sketch learning for conditional program generation. In Proceedings of the International Conference on Learning Representations (ICLR), 2018.
[41] Jacob Devlin, Jonathan Uesato, Surya Bhupatiraju, Rishabh Singh, Abdel-rahman Mohamed, and Pushmeet Kohli. RobustFill: Neural program learning under noisy I/O. In Proceedings of the International Conference on Machine Learning (ICML), 2017.
[42] Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, and Pushmeet Kohli. Neuro-symbolic program synthesis. In Proceedings of the International Conference on Learning Representations (ICLR), 2016.
[43] Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus, and Ni Lao. Neural symbolic machines: Learning semantic parsers on Freebase with weak supervision. arXiv preprint arXiv:1611.00020, 2016.
[44] Xinyun Chen, Chang Liu, and Dawn Song. Towards synthesizing complex programs from input-output examples. In Proceedings of the International Conference on Learning Representations (ICLR), 2018.
[45] Rudy Bunel, Matthew Hausknecht, Jacob Devlin, Rishabh Singh, and Pushmeet Kohli. Leveraging grammar and reinforcement learning for neural program synthesis. In Proceedings of the International Conference on Learning Representations (ICLR), 2018.
[46] Marc Brockschmidt, Yuxin Chen, Pushmeet Kohli, Siddharth Krishna, and Daniel Tarlow. Learning shape analysis. In Proceedings of the Static Analysis Symposium (SAS), 2017.
[47] Dimitar Bounov, Anthony De Rossi, Massimiliano Menarini, William G. Griswold, and Sorin Lerner. Inferring loop invariants through gamification. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI '18), 2018.
[48] Elias Boutros Khalil, Pierre Le Bodic, Le Song, George L. Nemhauser, and Bistra N. Dilkina. Learning to branch in mixed integer programming. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2016.
[49] Daniel Selsam, Matthew Lamm, Benedikt Bünz, Percy Liang, Leonardo de Moura, and David L. Dill. Learning a SAT solver from single-bit supervision. arXiv preprint arXiv:1802.03685, 2018.
[50] Byron Cook, Andreas Podelski, and Andrey Rybalchenko. Proving program termination. Communications of the ACM, 54(5), 2011.
[51] John C. Reynolds. Separation logic: A logic for shared mutable data structures.
In Proceedings of the IEEE Symposium on Logic in Computer Science (LICS), 2002.