Journal of Machine Learning Research 18 (2018) 1-5. Submitted 2/16; Revised 10/17; Published 4/18.

KeLP: a Kernel-based Learning Platform

Simone Filice (filice@info.uniroma2.it), DICII, University of Roma, Tor Vergata, Italy
Giuseppe Castellucci (castellucci@ing.uniroma2.it), DIE, University of Roma, Tor Vergata, Italy
Giovanni Da San Martino (gmartino@hbku.edu.qa), Qatar Computing Research Institute, HBKU, Qatar
Alessandro Moschitti (amosch@amazon.com), Amazon; Professor at the University of Trento, Italy
Danilo Croce (croce@info.uniroma2.it), DII, University of Roma, Tor Vergata, Italy
Roberto Basili (basili@info.uniroma2.it), DII, University of Roma, Tor Vergata, Italy

Editor: Cheng Soon Ong

Abstract

KeLP is a Java framework that enables fast and easy implementation of kernel functions over discrete data, such as strings, trees or graphs, and their combination with standard vectorial kernels. Additionally, it provides several kernel-based algorithms, e.g., online and batch kernel machines for classification, regression and clustering, and a Java environment for easy implementation of new algorithms. KeLP is a versatile toolkit, appealing both to experts and practitioners of machine learning and Java programming, who can find extensive documentation, tutorials and examples of increasing complexity on the accompanying website. Notably, KeLP can also be used without any knowledge of Java programming, through command-line tools and JSON/XML interfaces that enable the declaration and instantiation of articulated learning models using simple templates. Finally, the extensive use of modularity and interfaces in KeLP enables developers to easily extend it with their own kernels and algorithms.

Keywords: Kernel Machines, Structured Data and Kernels, Java Framework

©2018 Simone Filice, Giuseppe Castellucci, Giovanni Da San Martino, Alessandro Moschitti, Danilo Croce, Roberto Basili. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v18/16-087.html.

1. Introduction

Kernel methods for discrete structures (Shawe-Taylor and Cristianini, 2004) are popular and effective techniques for the design of learning algorithms on non-vectorial data, such as strings (Lodhi et al., 2002), trees (Collins and Duffy, 2002; Moschitti, 2006; Aiolli et al., 2009; Croce et al., 2011; Annesi et al., 2014) and graphs (Gärtner, 2003; Borgwardt and Kriegel, 2005; Shervashidze, 2011). These kernels are very valuable for modeling complex relations in real-world applications, where data naturally has a structured form: e.g., strings and graphs are used to represent DNA and chemical compounds, and parse trees can encode syntactic and semantic information expressed in text. However, current software for structural kernels is mainly limited to specific research, and is often not made publicly available or easily adaptable to new application domains. The SVM-Light-TK toolkit by Moschitti (2006) is one of the few exceptions, providing the user with different string and tree kernels, but no graph kernels. It is written in the C language; thus extending it with new kernels can be costly, especially when new data structures are required. This may also prevent non-programmers from using it for their specific applications.

In designing KeLP, we have capitalized on our previous experience with SVM-Light-TK and other toolkits to foster the reuse of previous software and models, as well as their extensibility. We provide a software platform for learning on structured data, which is both easy to use for inexperienced users and easily extendable for developers.
KeLP includes many standard kernel-based algorithms for classification, regression and clustering, as well as popular kernel functions for strings, trees and graphs. Additionally, it includes kernel functions for modeling relations between pairs of objects, which are required, e.g., in paraphrase detection, textual entailment and question answering (Moschitti and Zanzotto, 2007; Filice et al., 2015; Tymoshenko and Moschitti, 2015). Most importantly, new data structures, models, algorithms and kernels can be easily added on top of the existing code, facilitating and promoting the development of a library of kernel-based algorithms for structured data. The KeLP source code is distributed under the terms of the Apache 2.0 License. No additional software needs to be installed in order to use it: the Apache Maven project management tool resolves all module dependencies automatically. We also provide and maintain a website with updated tutorials and documentation.

2. The KeLP Framework: an Overview

KeLP is written in Java and uses three different Maven projects to logically separate its three main components: (i) the framework backbone, which implements classification, regression and clustering algorithms operating on vector-based kernels; these core modules, along with SVMs¹, are always part of any framework instantiation; (ii) additional-algorithm packages, e.g., online kernel machines, the Nyström method (Williams and Seeger, 2001) and label sequence learning (Altun et al., 2003); and (iii) additional-kernel packages, which include kernel functions for sequences, trees and graphs. A complete and up-to-date list of algorithms and kernel functions, full Javadoc API documentation in PDF, and tutorials for both end users and developers are hosted on the KeLP website, http://www.kelp-ml.org.
2.1 Machine Learning Algorithms

Learning algorithms in KeLP are implemented following implementation contracts provided by specific Java interfaces for different scenarios, i.e., classification, regression and clustering, according to two main learning paradigms, i.e., batch and online. New learning algorithms can implement these interfaces, thus becoming fully integrated with the other library functions. In more detail: (i) the ClassificationLearningAlgorithm interface supports the definition of classification learning methods, such as SVMs (Chang and Lin, 2011) or Dual Coordinate Descent (Hsieh et al., 2008). (ii) The RegressionLearningAlgorithm interface supports the definition of regressors, such as ε-SVR (Chang and Lin, 2011). (iii) The ClusteringAlgorithm interface enables the implementation of clustering algorithms, such as (Kulis et al., 2005). (iv) The OnlineLearningAlgorithm interface supports the definition of online learning algorithms, e.g., the Passive Aggressive (Crammer et al., 2006) or Soft Confidence-Weighted (Wang et al., 2012) algorithms. Finally, (v) the MetaLearningAlgorithm interface enables the design of committees, such as multi-classification schemas, e.g., One-VS-One and One-VS-All.

1. We include it because of its wide use.

```json
{
  "algorithm": "binaryCSvmClassification",
  "c": 10,
  "kernel": {
    "kernelType": "linearComb",
    "weights": [1, 1],
    "toCombine": [
      {
        "kernelType": "norm",
        "baseKernel": {
          "kernelType": "ptk",
          "representation": "constituent-tree",
          "mu": 0.4,
          "lambda": 0.4,
          "terminalFactor": 1.0
        }
      },
      {
        "kernelType": "linear",
        "representation": "wordspace"
      }
    ]
  }
}
```

Figure 1: A JSON description of an SVM classifier based on a linear combination of a normalized Partial Tree Kernel on a constituent tree and a linear kernel on a word-space vector.

2.2 Data Representation

In KeLP, data is represented by the Example class, which is constituted by (i) a set of Labels and (ii) a set of Representations.
The former enables the design of single-label or multi-label classifiers and multi-variate regressors. The latter models examples in terms of vectors (e.g., DenseVector and SparseVector) or structures (e.g., SequenceRepresentation, TreeRepresentation or GraphRepresentation). In particular, kernels can be defined over examples encoded by multiple representations (e.g., multiple parse trees, strings, graphs and feature vectors). This makes experimentation with multiple kernel combinations easy, requiring only negligible changes in the code or in the JSON description (see Section 2.4), without the need to modify the input data sets. Additionally, examples can be combined into more complex structures, e.g., ExamplePair, useful for learning relations between objects, e.g., pairs representing question and answer texts in QA, or text and hypothesis in textual entailment tasks. Building other types of data format is extremely simple; e.g., KeLP includes the SVM-Light-TK input format for trees and provides many scripts to use the popular gSpan format for graphs (and, indirectly, the 111 Open Babel formats²).

2.3 Building Kernels from Kernels

KeLP enables (i) kernel composition, i.e., deriving Kab(s1, s2) = (φa ∘ φb)(s1) · (φa ∘ φb)(s2) from Ka(s1, s2) = φa(s1) · φa(s2) and Kb(s1, s2) = φb(s1) · φb(s2); and (ii) kernel combinations, e.g., λ1 Ka(s1, s2) + λ2 Kb(s1, s2) · Ka(s1, s2). These operations are coded using the following abstractions of the Kernel class: (i) DirectKernel directly operates on a specified Representation object, derived from the Example object (e.g., implementing kernels for vectors, sequences, trees and graphs). (ii) The KernelComposition class composes Kernel objects, e.g., PolynomialKernel, RBFKernel and NormalizationKernel. (iii) The KernelCombination class enables the combination of different Kernels, e.g., the LinearKernelCombination class applies a weighted kernel sum. (iv) The KernelOnPair class operates
on ExamplePair, e.g., to learn similarity functions between sentences (Filice et al., 2015) or to implement ranking algorithms with the PreferenceKernel class.

2. http://openbabel.org

```java
public static void run(String trainPath, String testPath, String learningAlgoPath) {
  // Define (load) the learning algorithm (see the JSON in Fig. 1)
  JacksonSerializerWrapper serializer = new JacksonSerializerWrapper();
  ClassificationLearningAlgorithm learningAlgo;
  learningAlgo = serializer.readValue(new File(learningAlgoPath),
      ClassificationLearningAlgorithm.class);
  // Load the datasets
  SimpleDataset trainDataset = new SimpleDataset();
  trainDataset.populate(trainPath);
  SimpleDataset testDataset = new SimpleDataset();
  testDataset.populate(testPath);
  // Learn the classifier
  List
```
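The kernel-combination pattern of Section 2.3 can be illustrated in plain, self-contained Java (this is a sketch, not KeLP code; the Kernel interface and combine method below are hypothetical names introduced for illustration):

```java
// Sketch of the combination lambda1*Ka(s1,s2) + lambda2*Kb(s1,s2)*Ka(s1,s2)
// from Section 2.3, using a toy Kernel interface (not part of KeLP).
public class KernelComboSketch {

    // A kernel is any symmetric similarity function over a pair of inputs.
    interface Kernel<T> {
        double k(T a, T b);
    }

    // Weighted combination: lambda1*Ka(a,b) + lambda2*Kb(a,b)*Ka(a,b)
    static <T> Kernel<T> combine(Kernel<T> ka, Kernel<T> kb,
                                 double lambda1, double lambda2) {
        return (a, b) -> lambda1 * ka.k(a, b) + lambda2 * kb.k(a, b) * ka.k(a, b);
    }

    public static void main(String[] args) {
        // Two toy kernels on scalars: a linear kernel and an RBF-style kernel.
        Kernel<Double> linear = (a, b) -> a * b;
        Kernel<Double> rbf = (a, b) -> Math.exp(-(a - b) * (a - b));

        Kernel<Double> combo = combine(linear, rbf, 1.0, 0.5);
        // linear(2,3) = 6, rbf(2,3) = exp(-1), so combo = 6 + 0.5*exp(-1)*6 ≈ 7.1036
        System.out.println(combo.k(2.0, 3.0));
    }
}
```

In KeLP itself this role is played by KernelCombination subclasses such as LinearKernelCombination, which apply the weighted sum over arbitrary Kernel objects rather than over toy scalar kernels.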