Published as a conference paper at ICLR 2019

GENERATIVE CODE MODELING WITH GRAPHS

Marc Brockschmidt, Miltiadis Allamanis, Alexander Gaunt
Microsoft Research, Cambridge, UK
{mabrocks,miallama,algaunt}@microsoft.com

Oleksandr Polozov
Microsoft Research, Redmond, WA, USA
polozov@microsoft.com

ABSTRACT

Generative models for source code are an interesting structured prediction problem, requiring reasoning about both hard syntactic and semantic constraints as well as about natural, likely programs. We present a novel model for this problem that uses a graph to represent the intermediate state of the generated output. Our model generates code by interleaving grammar-driven expansion steps with graph augmentation and neural message passing steps. An experimental evaluation shows that our new model can generate semantically meaningful expressions, outperforming a range of strong baselines.

1 INTRODUCTION

Learning to understand and generate programs is an important building block for procedural artificial intelligence and more intelligent software engineering tools. It is also an interesting task in the research of structured prediction methods: while imbued with formal semantics and strict syntactic rules, natural source code carries aspects of natural languages, since it acts as a means of communicating intent among developers. Early works in the area have shown that approaches from natural language processing can be applied successfully to source code (Hindle et al., 2012), whereas the programming languages community has had success in focusing exclusively on formal semantics. More recently, methods handling both modalities (i.e., the formal and the natural language aspects) have shown successes on important software engineering tasks (Raychev et al., 2015; Bichsel et al., 2016; Allamanis et al., 2018b) and semantic parsing (Yin & Neubig, 2017; Rabinovich et al., 2017).
However, current generative models of source code mostly focus on only one of these modalities at a time. For example, program synthesis tools based on enumeration and deduction (Solar-Lezama, 2008; Polozov & Gulwani, 2015; Feser et al., 2015; Feng et al., 2018) are successful at generating programs that satisfy some (usually incomplete) formal specification, but are often obviously wrong on manual inspection, as they cannot distinguish unlikely from likely, natural programs. On the other hand, learned code models have succeeded in generating realistic-looking programs (Maddison & Tarlow, 2014; Bielik et al., 2016; Parisotto et al., 2017; Rabinovich et al., 2017; Yin & Neubig, 2017). However, these programs often fail to be semantically relevant, for example because variables are not used consistently.

In this work, we try to overcome these challenges for generative code models and present a general method for generative models that can incorporate structured information that is deterministically available at generation time. We focus our attention on generating source code and follow the ideas of program graphs (Allamanis et al., 2018b), which have been shown to learn semantically meaningful representations of (pre-existing) programs. To achieve this, we lift grammar-based tree decoder models into the graph setting, where the diverse relationships between various elements of the generated code can be modeled. For this, the syntax tree under generation is augmented with additional edges denoting known relationships (e.g., last use of variables). We then interleave the steps of the generative procedure with neural message passing (Gilmer et al., 2017) to compute more precise representations of the intermediate states of the program generation.
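The augmentation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the edge names follow the program-graph style (Child, NextToken, LastUse), but the node ids and the tiny AST are invented for the example.

```python
# Sketch: deterministically augmenting a partial syntax tree with extra
# edges, in the spirit of program graphs. All names here are illustrative.

def augment(ast_edges, token_order, last_use_pairs):
    """Return the edge set of the augmented graph.

    ast_edges:      (parent, child) pairs from the tree -> Child edges
    token_order:    terminal node ids in source order  -> NextToken edges
    last_use_pairs: (use, previous_use) variable pairs  -> LastUse edges
    """
    edges = [("Child", p, c) for p, c in ast_edges]
    edges += [("NextToken", a, b) for a, b in zip(token_order, token_order[1:])]
    edges += [("LastUse", u, prev) for u, prev in last_use_pairs]
    return edges

# Toy partial AST for `x = x + 1`: Assign -> (x, Plus -> (x, 1)).
tree = [("assign", "x1"), ("assign", "plus"), ("plus", "x2"), ("plus", "one")]
graph = augment(tree, ["x1", "x2", "one"], [("x2", "x1")])
```

The key property is that no edge is ever predicted: given the partial tree, the extra edges are computed by deterministic program analyses.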
This is fundamentally different from sequential generative models of graphs (Li et al., 2018; Samanta et al., 2018), which aim to generate all edges and nodes, whereas our graphs are deterministic augmentations of generated trees. To summarize, we present a) a general graph-based generative procedure for highly structured objects, incorporating rich structural information; b) ExprGen, a new code generation task focused on generating small, but semantically complex expressions conditioned on source code context; and c) a comprehensive experimental evaluation of our generative procedure and a range of baseline methods from the literature.

Algorithm 1 Pseudocode for Expand
Input: Context c, partial AST a, node v to expand
 1: h_v ← getRepresentation(c, a, v)
 2: rhs ← pickProduction(v, h_v)
 3: for child node type ℓ ∈ rhs do
 4:   (a, u) ← insertChild(a, ℓ)
 5:   if ℓ is nonterminal type then
 6:     a ← Expand(c, a, u)
 7: return a

  int ilOffsetIdx = Array.IndexOf(sortedILOffsets, map.ILOffset);
  int nextILOffsetIdx = ilOffsetIdx + 1;
  int nextMapILOffset = nextILOffsetIdx < sortedILOffsets.Length
      ? sortedILOffsets[nextILOffsetIdx] : int.MaxValue;

Figure 1: Example for ExprGen; the target expression to be generated is marked. Taken from BenchmarkDotNet, lightly edited for formatting.

2 BACKGROUND & TASK

The most general form of the code generation task is to produce a (partial) program in a programming language given some context information c. This context information can be natural language (as in, e.g., semantic parsing), input-output examples (e.g., inductive program synthesis), partial program sketches, etc. Early methods generate source code as a sequence of tokens (Hindle et al., 2012; Hellendoorn & Devanbu, 2017) and sometimes fail to produce syntactically correct code.
More recent models sidestep this issue by using the target language's grammar to generate abstract syntax trees (ASTs) (Maddison & Tarlow, 2014; Bielik et al., 2016; Parisotto et al., 2017; Yin & Neubig, 2017; Rabinovich et al., 2017), which are syntactically correct by construction. In this work, we follow the AST generation approach. The key idea is to construct the AST a sequentially, by expanding one node at a time using production rules from the underlying programming language grammar. This simplifies the code generation task to a sequence of classification problems, in which an appropriate production rule has to be chosen based on the context information and the partial AST generated so far. In this work, we simplify the problem further, similar to Maddison & Tarlow (2014) and Bielik et al. (2016), by fixing the order of the sequence to always expand the left-most, bottom-most nonterminal node. Alg. 1 illustrates the common structure of AST-generating models. Then, the probability of generating a given AST a given some context c is

    p(a | c) = ∏_t p(a_t | c, a_{<t}),

where a_t is the expansion decision made at step t and a_{<t} is the partial AST built before step t.

Note that comparison to the ground truth is purely syntactic, e.g., a generated j > i will not match the equivalent i < j.

Results. We show the results of our evaluation in Tab. 1. Overall, the graph encoder architecture seems to be best suited for this task. All models learn to generate syntactically valid code (which is relatively simple in our domain). However, the different encoder models perform very differently on semantic measures such as well-typedness and the retrieval of the ground truth expression. Most of the type errors are due to the use of an UNK literal (for example, the G → NAG model has only 4% type errors when filtering out such unknown literals). The results show a clear trend correlating better semantic results with the amount of information about the partially generated programs employed by the generative models. Transferring a trained model to unseen projects with a new project-specific vocabulary substantially worsens results, as expected.
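The left-most, bottom-most expansion order of Alg. 1 can be made concrete with a small runnable sketch. The toy grammar and the `pick` stub below are assumptions for illustration: `pick` stands in for the learned classifier that models p(a_t | c, a_{<t}); it is not the paper's model.

```python
# Minimal sketch of Alg. 1's Expand loop over a toy expression grammar.
# `pick` is a stand-in for the neural production chooser.

GRAMMAR = {  # nonterminal -> list of possible right-hand sides
    "Expr": [["Expr", "+", "Expr"], ["Var"], ["Lit"]],
}

def expand(node, pick, depth=0):
    """Expand `node` depth-first, left-to-right; return a nested tree."""
    if node not in GRAMMAR:            # terminal: nothing to expand
        return node
    rhs = pick(node, depth)            # one classification problem per step
    return (node, [expand(child, pick, depth + 1) for child in rhs])

# Deterministic stub: Expr -> Expr + Expr at the root, Var at the leaves.
def pick(node, depth):
    return GRAMMAR["Expr"][0] if depth == 0 else GRAMMAR["Expr"][1]

tree = expand("Expr", pick)  # builds the AST of `v1 + v2`-shaped expressions
```

A learned model replaces `pick` with a softmax over the grammar's productions, so log p(a | c) decomposes into a sum of per-step classification log-probabilities.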
Overall, our NAG model, which combines all additional signal sources, seems to perform best on most measures, and seems to be least impacted by the transfer.

  int methParamCount = 0;
  if (paramCount > 0) {
      IParameterTypeInformation[] moduleParamArr =
          GetParamTypeInformations(Dummy.Signature, paramCount);
      methParamCount = moduleParamArr.Length;
  }
  if ( paramCount > methParamCount ) {
      IParameterTypeInformation[] moduleParamArr =
          GetParamTypeInformations(Dummy.Signature,
              paramCount - methParamCount);
  }

  G → NAG: paramCount > methParamCount (34.4%)
           paramCount == methParamCount (11.4%)
           paramCount < methParamCount (10.0%)
  G → ASN: paramCount == 0 (12.7%)
           paramCount < 0 (11.5%)
           paramCount > 0 (8.0%)

  public static String URItoPath(String uri) {
      if (System.Text.RegularExpressions
              .Regex.IsMatch(uri, "file:\\\\[a-z,A-Z]:")) {
          return uri.Substring(6);
      }
      if ( uri.StartsWith(@"file:") ) {
          return uri.Substring(5);
      }
      return uri;
  }

  G → NAG: uri.Contains(UNK_STRING_LITERAL) (32.4%)
           uri.StartsWith(UNK_STRING_LITERAL) (29.2%)
           uri.HasValue() (7.7%)
  G → Syn: uri == UNK_STRING_LITERAL (26.4%)
           uri == "" (8.5%)
           uri.StartsWith(UNK_STRING_LITERAL) (6.7%)

Figure 3: Two lightly edited examples from our test set and expressions predicted by different models. More examples can be found in the supplementary material.

5.2 QUALITATIVE EVALUATION

As the results in the previous section suggest, the proposed ExprGen task is hard even for the strongest models we evaluated, which achieve no more than 50% accuracy on the top prediction. It is also unsolvable for classical logico-deductive program synthesis systems, as the provided code context does not form a precise specification. However, we do know that most instances of the task are (easily) solvable for professional software developers, and thus believe that machine learning systems can have considerable success on the task. Fig.
3 shows two (abbreviated) samples from our test set, together with the predictions made by the two strongest models we evaluated. In the first example, we can see that the G → NAG model correctly identifies that the relationship between paramCount and methParamCount is important (as they appear together in the block guarded by the expression to generate), and thus generates comparison expressions between the two variables. The G → ASN model lacks the ability to recognize that paramCount (or any variable) was already used and thus fails to insert both relevant variables. We found this to be a common failure, often leading to suggestions using only one variable (possibly repeatedly). In the second example, both G → NAG and G → Syn have learned the common if (var.StartsWith(...)) { ... var.Substring(num) ... } pattern, but of course fail to produce the correct string literal in the condition. We show results for all of our models on these examples, as well as on additional examples, in the supplementary material (Appendix B).

6 DISCUSSION & CONCLUSIONS

We presented a generative code model that leverages known semantics of partially generated programs to direct the generative procedure. The key idea is to augment partial programs to obtain a graph, and then use graph neural networks to compute a precise representation for the partial program. This representation then helps to better guide the remainder of the generative procedure. We have shown that this approach can be used to generate small but semantically interesting expressions from very imprecise context information. The presented model could be useful in program repair scenarios (where repair proposals need to be scored based on their context) or in the code review setting (where it could highlight very unlikely expressions). We also believe that similar models could have applications in related domains, such as semantic parsing, neural program synthesis, and text generation.
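The "compute a precise representation for the partial program" step relies on neural message passing over the augmented graph. The following is a deliberately simplified sketch: real GGNN-style models use learned, edge-type-specific transformations and a GRU state update, whereas here each edge type applies a fixed scalar weight and the update is state plus mean of incoming messages, purely to make the propagation pattern concrete. All node ids, weights, and states are invented.

```python
# One round of message passing over a (tiny) augmented program graph.
# Edge-type weights stand in for learned per-edge-type transformations.

def message_passing_step(states, edges, weights):
    """states: node -> state; edges: (etype, src, dst); weights: etype -> float."""
    incoming = {v: [] for v in states}
    for etype, src, dst in edges:
        # Each edge sends a message from src to dst, modulated by its type.
        incoming[dst].append(weights[etype] * states[src])
    # New state = old state + mean of incoming messages (GRU stand-in).
    return {v: s + (sum(m) / len(m) if (m := incoming[v]) else 0.0)
            for v, s in states.items()}

states = {"x1": 0.0, "x2": 1.0, "plus": 2.0}
edges = [("Child", "plus", "x2"), ("LastUse", "x2", "x1")]
new = message_passing_step(states, edges, {"Child": 0.5, "LastUse": 0.25})
```

Interleaving such propagation rounds with the expansion steps is what lets information like "this variable was already used" flow into the representation of the node being expanded.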
Miltiadis Allamanis, Earl T. Barr, Premkumar Devanbu, and Charles Sutton. A survey of machine learning for big code and naturalness. ACM Computing Surveys, 2018a.

Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. Learning to represent programs with graphs. In International Conference on Learning Representations (ICLR), 2018b.

Matthew Amodio, Swarat Chaudhuri, and Thomas W. Reps. Neural attribute machines for program generation. arXiv preprint arXiv:1705.09231, 2017.

Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. In International Conference on Learning Representations (ICLR), 2014.

Benjamin Bichsel, Veselin Raychev, Petar Tsankov, and Martin Vechev. Statistical deobfuscation of Android applications. In Conference on Computer and Communications Security (CCS), 2016.

Pavol Bielik, Veselin Raychev, and Martin Vechev. PHOG: probabilistic model for code. In International Conference on Machine Learning (ICML), 2016.

Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. On the properties of neural machine translation: Encoder-decoder approaches. In Workshop on Syntax, Semantics and Structure in Statistical Translation, 2014.

Yu Feng, Ruben Martins, Osbert Bastani, and Isil Dillig. Program synthesis using conflict-driven learning. In Programming Language Design and Implementation (PLDI), 2018.

John K. Feser, Swarat Chaudhuri, and Isil Dillig. Synthesizing data structure transformations from input-output examples. In Programming Language Design and Implementation (PLDI), 2015.

Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, and George E. Dahl. Neural message passing for quantum chemistry. In International Conference on Machine Learning (ICML), 2017.

Vincent J. Hellendoorn and Premkumar Devanbu. Are deep neural networks the best choice for modeling source code? In Foundations of Software Engineering (FSE), 2017.
Abram Hindle, Earl T. Barr, Zhendong Su, Mark Gabel, and Premkumar Devanbu. On the naturalness of software. In International Conference on Software Engineering (ICSE), 2012.

Donald E. Knuth. Semantics of context-free languages. Mathematical Systems Theory, 2(2):127–145, 1967.

Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard Zemel. Gated graph sequence neural networks. In International Conference on Learning Representations (ICLR), 2016.

Yujia Li, Oriol Vinyals, Chris Dyer, Razvan Pascanu, and Peter Battaglia. Learning deep generative models of graphs. CoRR, abs/1803.03324, 2018.

Cristina V. Lopes, Petr Maj, Pedro Martins, Vaibhav Saini, Di Yang, Jakub Zitny, Hitesh Sajnani, and Jan Vitek. DéjàVu: a map of code duplicates on GitHub. In Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), 2017.

Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. Effective approaches to attention-based neural machine translation. In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015.

Chris J. Maddison and Daniel Tarlow. Structured generative models of natural source code. In International Conference on Machine Learning (ICML), 2014.

Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, and Pushmeet Kohli. Neuro-symbolic program synthesis. In International Conference on Learning Representations (ICLR), 2017.

Oleksandr Polozov and Sumit Gulwani. FlashMeta: a framework for inductive program synthesis. In Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), 2015.

Maxim Rabinovich, Mitchell Stern, and Dan Klein. Abstract syntax networks for code generation and semantic parsing. In Annual Meeting of the Association for Computational Linguistics (ACL), 2017.

Veselin Raychev, Martin Vechev, and Eran Yahav. Code completion with statistical language models. In Programming Language Design and Implementation (PLDI), 2014.
Veselin Raychev, Martin Vechev, and Andreas Krause. Predicting program properties from "Big Code". In Principles of Programming Languages (POPL), 2015.

Bidisha Samanta, Abir De, Niloy Ganguly, and Manuel Gomez-Rodriguez. Designing random graph models using variational autoencoders with applications to chemical design. CoRR, abs/1802.05283, 2018.

Armando Solar-Lezama. Program synthesis by sketching. PhD thesis, University of California, Berkeley, 2008.

Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. Pointer networks. In Advances in Neural Information Processing Systems (NIPS), 2015.

Pengcheng Yin and Graham Neubig. A syntactic neural model for general-purpose code generation. In Annual Meeting of the Association for Computational Linguistics (ACL), 2017.

A DATASET SAMPLES

Below we list some sample snippets from the training set for our ExprGen task. The highlighted expressions are to be generated.

  for (int i = 0; i < 3*timeSpanUnits + 1 ; ++i) {
      consolidator.Update(new TradeBar { Time = refDateTime });
      if (i < timeSpanUnits) {
          // before initial consolidation happens
          Assert.IsNull(consolidated);
      } else {
          Assert.IsNotNull(consolidated);
          if ( i % timeSpanUnits == 0 ) { // i = 3, 6, 9
              Assert.AreEqual(refDateTime.AddMinutes(-timeSpanUnits),
                  consolidated.Time);
          }
      }
      refDateTime = refDateTime.AddMinutes(1);
  }

Figure 4: Sample snippet from the Lean project. Formatting has been modified.

  var words = (from word in phrase.Split(' ')
               where word.Length > 0
               select word.ToLower()).ToArray();

Figure 5: Sample snippet from the BotBuilder project. Formatting has been modified.

  _hasHandle = _mutex.WaitOne(
      timeOut < 0 ? Timeout.Infinite : timeOut,
      exitContext: false);

Figure 6: Sample snippet from the Chocolatey project. Formatting has been modified.
  public static T retry<T>(int numberOfTries, Func<T> function,
      int waitDurationMilliseconds = 100, int increaseRetryByMilliseconds = 0) {
      if (function == null) return default(T);
      if (numberOfTries == 0)
          throw new ApplicationException("You must specify a number"
              + " of retries greater than zero.");
      var returnValue = default(T);
      var debugging = log_is_in_debug_mode();
      var logLocation = ChocolateyLoggers.Normal;
      for (int i = 1; i <= numberOfTries ; i++) {

Figure 7: Sample snippet from the Chocolatey project. Formatting has been modified and the snippet has been abbreviated.

  while ( count >= startIndex ) {
      c = s[count];
      if ( c != ' ' && c != '\n' ) break;
      count--;
  }

Figure 8: Sample snippet from the CommonMark.NET project. Formatting has been modified.

  private string GetResourceForTimeSpan(TimeUnit unit, int count) {
      var resourceKey =
          ResourceKeys.TimeSpanHumanize.GetResourceKey(unit, count);
      return count == 1 ? Format(resourceKey) : Format(resourceKey, count);
  }

Figure 9: Sample snippet from the Humanizer project. Formatting has been modified.

  var indexOfEquals = segment.IndexOf('=') ;
  if ( indexOfEquals == -1 ) {
      var decoded = UrlDecode(segment, encoding);
      return new KeyValuePair<string, string>(decoded, decoded);
  }

Figure 10: Sample snippet from the Nancy project. Formatting has been modified.

  private bool ResolveWritableOverride(bool writable) {
      if (!Writable && writable)
          throw new StorageInvalidOperationException(
              "Cannot open writable storage" + " in readonly storage.");
      bool openWritable = Writable;
      if ( openWritable && !writable )
          openWritable = writable;
      return openWritable;
  }

Figure 11: Sample snippet from the Open Live Writer project. Formatting has been modified.

  char c = html[j];
  if ( c == ';' || (!(c >= 'a' && c <= 'z') && !(c >= 'A' && c <= 'Z')
          && !(c >= '0' && c <= '9')) ) {

Figure 12: Sample snippet from the Open Live Writer project. Formatting has been modified.
  string entityRef = html.Substring(i + 1, j - (i + 1)) ;

Figure 13: Sample snippet from the Open Live Writer project. Formatting has been modified.

B SAMPLE GENERATIONS

On the following pages, we list some sample snippets from the test set for our ExprGen task, together with suggestions produced by different models. The highlighted expressions are the ground truth expressions that should be generated.

  if (context.Context == _MARKUP_CONTEXT_TYPE.CONTEXT_TYPE_Text
          && !String.IsNullOrEmpty(text)) {
      idx = originalText.IndexOf(text) ;
      if (idx == 0) {
          // Drop this portion from the expected string
          originalText = originalText.Substring(text.Length);
          // Update the current pointer
          beginDamagePointer.MoveToPointer(currentRange.End);
      } else if (idx > 0 && originalText.Substring(0, idx)
              .Replace("\r\n", string.Empty).Length == 0) {
          // Drop this portion from the expected string
          originalText = originalText.Substring(text.Length + idx);
          // Update the current pointer
          beginDamagePointer.MoveToPointer(currentRange.End);
      } else {
          return false;
      }
  }

Sample snippet from Open Live Writer.
The following suggestions were made:

  Seq→Seq: UNK_TOKEN[i] (0.6%)
           input[inputOffset + 1] (0.3%)
           UNK_TOKEN & UNK_NUM_LITERAL (0.3%)

  MarshalUrlSupported.IndexOf(UNK_CHAR_LITERAL) (0.9%)
  IsEditFieldSelected.IndexOf(UNK_CHAR_LITERAL) (0.8%)
  marshalUrlSupported.IndexOf(UNK_CHAR_LITERAL) (0.7%)

  UNK_TOKEN.IndexOf(UNK_CHAR_LITERAL) (21.6%)
  UNK_TOKEN.LastIndexOf(UNK_CHAR_LITERAL) (14.9%)
  UNK_TOKEN.GetHashCode() (8.1%)

  UNK_CHAR_LITERAL.IndexOf(UNK_CHAR_LITERAL) (8.1%)
  UNK_CHAR_LITERAL.IndexOf(originalText) (8.1%)
  originalText.IndexOf(UNK_CHAR_LITERAL) (8.1%)

  originalText.GetHashCode() (37.8%)
  originalText.IndexOf(UNK_CHAR_LITERAL) (14.8%)
  originalText.LastIndexOf(UNK_CHAR_LITERAL) (6.2%)

  text.IndexOf(UNK_CHAR_LITERAL) (20.9%)
  text.LastIndexOf(UNK_CHAR_LITERAL) (12.4%)
  originalText.IndexOf(UNK_CHAR_LITERAL) (11.6%)

  originalText.IndexOf(UNK_CHAR_LITERAL) (32.8%)
  originalText.LastIndexOf(UNK_CHAR_LITERAL) (12.4%)
  originalText.IndexOf(text) (8.7%)

  caretPos--;
  if (caretPos < 0) {
      caretPos = 0;
  }
  int len = inputString.Length;
  if (caretPos >= len) {
      caretPos = len - 1 ;
  }

Sample snippet from acat. The following suggestions were made:

  Seq→Seq: UNK_TOKEN+1 (2.1%)
           UNK_TOKEN+UNK_TOKEN] (1.8%)
           UNK_TOKEN.IndexOf(UNK_CHAR_LITERAL) (1.3%)

  wordToReplace - 1 (3.2%)
  insertOrReplaceOffset - 1 (2.9%)
  inputString - 1 (1.9%)

  len + 1 (35.6%)
  len - 1 (11.3%)
  len >> UNK_NUM_LITERAL (3.5%)

  len + len (24.9%)
  len - len (10.7%)
  1 + len (3.7%)

  len + 1 (22.8%)
  len - 1 (10.8%)
  len + len (10.3%)

  len + 1 (13.7%)
  len - 1 (11.5%)
  len - len (11.0%)

  len++ (33.6%)
  len-1 (21.9%)
  len+1 (14.6%)

  public static String URItoPath(String uri) {
      if (System.Text.RegularExpressions
              .Regex.IsMatch(uri, "file:\\\\[a-z,A-Z]:")) {
          return uri.Substring(6);
      }
      if ( uri.StartsWith(@"file:") ) {
          return uri.Substring(5);
      }
      return uri;
  }

Sample snippet from acat.
The following suggestions were made:

  Seq→Seq: !UNK_TOKEN (11.1%)
           UNK_TOKEN == 0 (3.6%)
           UNK_TOKEN != 0 (3.4%)

  !uri (7.6%)
  !MyVideos (4.7%)
  !MyDocuments (4.7%)

  action == UNK_STRING_LITERAL (22.6%)
  label == UNK_STRING_LITERAL (14.8%)
  file.Contains(UNK_STRING_LITERAL) (4.6%)

  uri == uri (7.4%)
  uri.StartsWith(uri) (5.5%)
  uri.Contains(uri) (4.3%)

  uri == UNK_STRING_LITERAL (11.7%)
  uri.Contains(UNK_STRING_LITERAL) (11.7%)
  uri.StartsWith(UNK_STRING_LITERAL) (8.3%)

  uri == UNK_STRING_LITERAL (26.4%)
  uri == "" (8.5%)
  uri.StartsWith(UNK_STRING_LITERAL) (6.7%)

  uri.Contains(UNK_STRING_LITERAL) (32.4%)
  uri.StartsWith(UNK_STRING_LITERAL) (29.2%)
  uri.HasValue() (7.7%)

  startPos = index + 1;
  int count = endPos - startPos + 1;
  word = (count > 0) ? input.Substring(startPos, count) : String.Empty;

Sample snippet from acat. The following suggestions were made:

  Seq→Seq: UNK_TOKEN.Trim() (3.4%)
           UNK_TOKEN.Replace(UNK_STRING_LITERAL, UNK_STRING_LITERAL) (2.1%)
           UNK_TOKEN.Replace(UNK_CHAR, UNK_CHAR) (3.4%)

  input[index] (1.4%)
  startPos[input] (0.9%)
  input[count] (0.8%)

  val.Trim() (6.6%)
  input.Trim() (6.5%)
  input.Substring(UNK_NUM_LITERAL) (4.0%)

  UNK_STRING_LITERAL + UNK_STRING_LITERAL (8.4%)
  UNK_STRING_LITERAL + startPos (7.8%)
  startPos + UNK_STRING_LITERAL (7.8%)

  input.Trim() (15.6%)
  input.Substring(0) (6.4%)
  input.Replace(UNK_STRING_LITERAL, UNK_STRING_LITERAL) (2.8%)

  input.Trim() (7.8%)
  input.ToLower() (6.4%)
  input + UNK_STRING_LITERAL (5.6%)

  input+StartPos (11.8%)
  input+count (9.5%)
  input.Substring(startPos, endPos - count) (6.3%)

  protected virtual void CrawlSite() {
      while ( !_crawlComplete ) {
          RunPreWorkChecks();
          if (_scheduler.Count > 0) {
              _threadManager.DoWork(() =>
                  ProcessPage(_scheduler.GetNext()));
          } else if (!_threadManager.HasRunningThreads()) {
              _crawlComplete = true;
          } else {
              _logger.DebugFormat("Waiting for links to be scheduled...");
              Thread.Sleep(2500);
          }
      }
  }

Sample snippet from Abot. The following suggestions were made:

  Seq→Seq: !UNK_TOKEN (9.4%)
           UNK_TOKEN > 0 (2.6%)
           UNK_TOKEN != value (1.3%)

  !_maxPagesToCrawlLimitReachedOrScheduled (26.2%)
  !_crawlCancellationReported (26.0%)
  !_crawlStopReported (21.8%)

  !UNK_TOKEN (54.9%)
  !done (18.8%)
  !throwOnError (3.3%)

  !_crawlCancellationReported (23.6%)
  !_crawlStopReported (23.3%)
  !_maxPagesToCrawlLimitReachedOrScheduled (18.9%)

  !_crawlStopReported (26.6%)
  !_crawlCancellationReported (26.5%)
  !_maxPagesToCrawlLimitReachedOrScheduled (25.8%)

  !_crawlStopReported (19.6%)
  !_maxPagesToCrawlLimitReachedOrScheduled (19.0%)
  !_crawlCancellationReported (15.7%)

  !_crawlStopReported (38.4%)
  !_crawlCancellationReported (31.8%)
  !_maxPagesToCrawlLimitReachedOrScheduled (27.0%)

  char character = originalName[i];
  if ( character == '<' ) {
      ++startTagCount;
      builder.Append( );
  } else if (startTagCount > 0) {
      if (character == '>' ) {
          --startTagCount;
      }
Sample snippet from StyleCop. The following suggestions were made:

  Seq→Seq: x == UNK_CHAR_LITERAL (5.9%)
           UNK_TOKEN == 0 (3.3%)
           UNK_TOKEN > 0 (2.7%)

  !i == 0 (5.1%)
  character < 0 (2.7%)
  character (2.2%)

  character == UNK_CHAR_LITERAL (70.8%)
  character == UNK_CHAR_LITERAL || character == UNK_CHAR_LITERAL (5.8%)
  character != UNK_CHAR_LITERAL (3.1%)

  character == character (9.9%)
  UNK_CHAR_LITERAL == character (8.2%)
  character == UNK_CHAR_LITERAL (8.2%)

  character == UNK_CHAR_LITERAL (43.4%)
  character || character (3.3%)
  character == UNK_CHAR_LITERAL == UNK_CHAR_LITERAL (3.0%)

  character == UNK_CHAR_LITERAL (39.6%)
  character || character == UNK_STRING_LITERAL (5.2%)
  character == UNK_STRING_LITERAL (2.8%)

  character == UNK_CHAR_LITERAL (75.5%)
  character == (2.6%)
  character != UNK_CHAR (2.5%)

  public void AllowAccess(string path) {
      if (path == null)
          throw new ArgumentNullException("path");
      if ( !path.StartsWith("~/") )
          throw new ArgumentException(
              string.Format(
                  "The path \"{0}\" is not application relative."
                  + " It must start with \"~/\".", path),
              "path");
      paths.Add(path);
  }

Sample snippet from cassette.
The following suggestions were made:

  Seq→Seq: UNK_TOKEN < 0 (14.6%)
           !UNK_TOKEN (7.5%)
           UNK_TOKEN == 0 (3.3%)

  path == UNK_STRING_LITERAL (18.1%)
  path <= 0 (5.6%)
  path == "" (4.8%)

  !UNK_TOKEN (48.0%)
  !discardNulls (6.3%)
  !first (2.7%)

  !path (67.4%)
  path && path (8.4%)
  !!path (5.5%)

  !path (91.5%)
  !path && !path (0.9%)
  !path.Contains(UNK_STRING_LITERAL) (0.7%)

  !path (89.6%)
  !path && !path (1.5%)
  !path.Contains(UNK_STRING_LITERAL) (0.5%)

  !path (42.9%)
  !path.StartsWith(UNK_STRING_LITERAL) (23.8%)
  !path.Contains(UNK_STRING_LITERAL) (5.9%)

  int methodParamCount = 0;
  IEnumerable<IParameterTypeInformation> moduleParameters =
      Enumerable.Empty<IParameterTypeInformation>();
  if (paramCount > 0) {
      IParameterTypeInformation[] moduleParameterArr =
          this.GetModuleParameterTypeInformations(Dummy.Signature, paramCount);
      methodParamCount = moduleParameterArr.Length;
      if (methodParamCount > 0)
          moduleParameters = IteratorHelper.GetReadonly(moduleParameterArr);
  }
  IEnumerable<IParameterTypeInformation> moduleVarargsParameters =
      Enumerable.Empty<IParameterTypeInformation>();
  if ( paramCount > methodParamCount ) {
      IParameterTypeInformation[] moduleParameterArr =
          this.GetModuleParameterTypeInformations(
              Dummy.Signature, paramCount - methodParamCount);
      if (moduleParameterArr.Length > 0)
          moduleVarargsParameters = IteratorHelper.GetReadonly(moduleParameterArr);
  }

Sample snippet from Afterthought.
The following suggestions were made:

  Seq→Seq: !UNK_TOKEN (10.9%)
           UNK_TOKEN == UNK_TOKEN (4.6%)
           UNK_TOKEN == UNK_STRING_LITERAL (3.3%)

  dummyPinned != 0 (2.2%)
  paramCount != 0 (2.1%)
  dummyPinned == 0 (1.5%)

  newValue > 0 (9.7%)
  zeroes > 0 (9.0%)
  paramCount > 0 (6.0%)

  methodParamCount == methodParamCount (3.4%)
  0 == methodParamCount (2.8%)
  methodParamCount == paramCount (2.8%)

  paramCount == 0 (12.7%)
  paramCount < 0 (11.5%)
  paramCount > 0 (8.0%)

  methodParamCount > 0 (10.9%)
  paramCount > 0 (7.9%)
  methodParamCount != 0 (5.6%)

  paramCount > methodParamCount (34.4%)
  paramCount == methodParamCount (11.4%)
  paramCount < methodParamCount (10.0%)

  public CodeLocation(int index, int endIndex, int indexOnLine,
      int endIndexOnLine, int lineNumber, int endLineNumber) {
      Param.RequireGreaterThanOrEqualToZero(index, "index");
      Param.RequireGreaterThanOrEqualTo(endIndex, index, "endIndex");
      Param.RequireGreaterThanOrEqualToZero(indexOnLine, "indexOnLine");
      Param.RequireGreaterThanOrEqualToZero(endIndexOnLine, "endIndexOnLine");
      Param.RequireGreaterThanZero(lineNumber, "lineNumber");
      Param.RequireGreaterThanOrEqualTo(endLineNumber, lineNumber,
          "endLineNumber");

      // If the entire segment is on the same line,
      // make sure the end index is greater or equal to the start index.
      if ( lineNumber == endLineNumber ) {
          Debug.Assert(endIndexOnLine >= indexOnLine,
              "The end index must be greater than the start index,"
              + " since they are both on the same line.");
      }

      this.startPoint = new CodePoint(index, indexOnLine, lineNumber);
      this.endPoint = new CodePoint(endIndex, endIndexOnLine, endLineNumber);
  }

Sample snippet from StyleCop.
The following suggestions were made:

  Seq→Seq: !UNK_TOKEN (14.0%)
           UNK_TOKEN == 0 (4.4%)
           UNK_TOKEN > 0 (3.5%)

  endIndex < 0 (3.8%)
  endIndex > 0 (3.4%)
  endIndex == 0 (2.2%)

  lineNumber < 0 (9.4%)
  lineNumber == 0 (7.4%)
  lineNumber <= 0 (5.1%)

  lineNumber == lineNumber (3.4%)
  0 == lineNumber (2.5%)
  lineNumber > lineNumber (2.5%)

  endLineNumber == 0 (9.6%)
  endLineNumber < 0 (7.9%)
  endLineNumber > 0 (6.1%)

  lineNumber > 0 (11.3%)
  lineNumber == 0 (7.3%)
  lineNumber != 0 (6.7%)

  lineNumber > endLineNumber (20.7%)
  lineNumber < endLineNumber (16.5%)
  lineNumber == endLineNumber (16.2%)

  public static Bitmap RotateImage(Image img, float angleDegrees,
      bool upsize, bool clip) {
      // Test for zero rotation and return a clone of the input image
      if (angleDegrees == 0f)
          return (Bitmap)img.Clone();

      // Set up old and new image dimensions, assuming upsizing not wanted
      // and clipping OK
      int oldWidth = img.Width;
      int oldHeight = img.Height;
      int newWidth = oldWidth;
      int newHeight = oldHeight;
      float scaleFactor = 1f;

      // If upsizing wanted or clipping not OK calculate the size of the
      // resulting bitmap
      if ( upsize || !clip ) {
          double angleRadians = angleDegrees * Math.PI / 180d;
          double cos = Math.Abs(Math.Cos(angleRadians));
          double sin = Math.Abs(Math.Sin(angleRadians));
          newWidth = (int)Math.Round((oldWidth * cos) + (oldHeight * sin));
          newHeight = (int)Math.Round((oldWidth * sin) + (oldHeight * cos));
      }

      // If upsizing not wanted and clipping not OK need a scaling factor
      if (!upsize && !clip) {
          scaleFactor = Math.Min((float)oldWidth / newWidth,
              (float)oldHeight / newHeight);
          newWidth = oldWidth;
          newHeight = oldHeight;
      }

Sample snippet from ShareX.
The following suggestions were made:

  Seq→Seq: UNK_TOKEN > 0 (8.3%)
           !UNK_TOKEN (4.4%)
           UNK_TOKEN == 0 (2.6%)

  newHeight > 0 (5.1%)
  clip > 0 (3.2%)
  oldWidth > 0 (2.9%)

  UNK_TOKEN && UNK_TOKEN (15.0%)
  UNK_TOKEN || UNK_TOKEN (13.6%)
  trustedForDelegation && !appOnly (12.1%)

  upsize && upsize (21.5%)
  upsize && clip (10.9%)
  clip && upsize (10.9%)

  upsize && clip (13.9%)
  upsize && !clip (9.8%)
  clip && clip (9.3%)

  upsize && !upsize (6.9%)
  clip && !upsize (6.3%)
  upsize || upsize (5.7%)

  upsize || clip (19.1%)
  upsize && clip (18.8%)
  upsize && !clip (12.2%)