# Fact Checking via Evidence Patterns

Valeria Fionda¹, Giuseppe Pirrò²
¹ DeMaCS, University of Calabria, Rende, Italy
² ICAR-CNR, Rende, Italy
fionda@mat.unical.it, pirro@icar.cnr.it

Abstract

We tackle fact checking using Knowledge Graphs (KGs) as a source of background knowledge. Our approach leverages the KG schema to generate candidate evidence patterns, that is, schema-level paths that capture the semantics of a target fact in alternative ways. Patterns verified in the data are used both to assemble semantic evidence for a fact and to provide a numerical assessment of its truthfulness. We present efficient algorithms to generate and verify evidence patterns and to assemble evidence. We also provide a translation of the core of our algorithms into the SPARQL query language. Not only is our approach faster than the state of the art with comparable accuracy, but it can also use any SPARQL-enabled KG.

1 Introduction

We live in a digital era where both false and true rumors spread at an unprecedented speed. Having a way to quickly assess the reliability of claims becomes crucial. Most existing computational approaches (e.g., [Zubiaga et al., 2017; Hassan et al., 2017]) detect facts in large corpora (e.g., the Web, Twitter) and leverage indicators (e.g., tweet popularity) and other (mostly syntactic) features to measure their credibility. The success of these approaches depends on the accuracy of fact spotting, the capability to assemble background knowledge (e.g., related facts), and the response time. The broad goal of this paper is to tackle fact checking from a semantic point of view by using Knowledge Graphs (KGs) like DBpedia and Yago, which provide semantically rich and automatically processable knowledge. KGs maintain atomic facts of the form (Dune, director, D. Lynch), structured thanks to a schema, which makes it possible to understand the constituent parts of a fact (e.g., the fact is about a Film directed by a Person).
We present an approach that can contextualize facts by identifying semantically related facts (e.g., facts about Actors of a Film). Differently from previous work, which starts from the data to surface evidence for a fact, we leverage the schema and a predicate relatedness measure to generate candidate evidence patterns, that is, schema-compliant paths that capture the semantics of a fact in alternative ways. Semantic evidence is assembled from the evidence patterns that are verified in the data. Interestingly, verified evidence patterns can be reused for facts of the same kind. Our approach is scalable thanks to efficient algorithms for candidate generation, verification, and evidence building. We implemented our algorithms both in memory and via a translation into SPARQL [Harris et al., 2013]. This latter aspect allows us to use any SPARQL-enabled KG. Not only does our approach offer an off-the-shelf fact-checking solution, but it can also help existing approaches seamlessly incorporate structured knowledge into the loop.

1.1 Related Work

Fact checking has been studied from different perspectives. [Gupta et al., 2014; Ratkiewicz et al., 2011] tackle misinformation spread in social networks; while considering contextual/user information (e.g., credibility), these approaches do not analyze the semantics of (related) facts. [Nickel et al., 2016; Bordes et al., 2013; Lin et al., 2015] leverage embeddings for entities/predicates; both the computation of embeddings and the fact-checking process are not scalable and depend on the embedding. [Wu et al., 2014] focus on specific types of facts only (i.e., window aggregates). ClaimBuster [Hassan et al., 2017] detects claims in political discourse; it neither considers structured knowledge nor allows one to build (reusable) semantic evidence. Logic-based approaches [Leblay, 2017] focus on how to represent/query facts and capture incompleteness/uncertainty; in [Leblay, 2017] no experimental evaluation is discussed.
Our work differs from rule-learning techniques (e.g., [Galárraga et al., 2015; Gad-Elrab et al., 2016]); we focus on fact checking, for which we can quickly generate semantic evidence and provide a numerical assessment by proceeding from the schema (and not from the data) and by considering only semantically related facts. Our approach differs from other KG-based approaches (e.g., [Shiralkar et al., 2017; Ciampaglia et al., 2015; Shi and Weninger, 2016]) since: (i) instead of starting from the data to surface evidence, we start from the KG schema, which allows us to assemble semantic evidence both more quickly and more precisely, by focusing first on evidence patterns that are semantically related to the target fact; (ii) our approach can provide verdicts in a few seconds, instead of hundreds of seconds as in, for instance, [Shiralkar et al., 2017]; (iii) our approach can scale to very large (online available) KGs thanks to the SPARQL-based implementation.

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

Figure 1: Given the KG schema (a), we build the schema graph (b). Given a fact (c), we generate a fact template by using type information (d) and then generate candidate evidence patterns using the schema graph (e) that will be verified in the data (f) to collect semantic evidence.

1.2 Running Example

Consider the fact (Dune, director, D. Lynch) shown in Fig. 1 (c) and the DBpedia KG as source of background knowledge.

Schema Graph. Our approach leverages the class hierarchy and the domain and range of predicates defined in the KG schema, along with RDFS inference rules [Munoz et al., 2009; Franconi et al., 2013], to build a schema graph where nodes are classes (entity types) and edges are predicates. Fig. 1 (a) shows an excerpt of the DBpedia schema. As an example, the predicate director has Film as one of its domains and Person as range; moreover, Film is a subclass of Work. Fig. 1 (b) shows an excerpt of the corresponding schema graph; there are two edges labeled director: the first, (Film, director, Person), is directly derived from domain and range, while the second, (Work, director, Person), is obtained by deriving a new domain (i.e., Work) for the predicate director using RDFS inference.

Candidate Evidence Patterns. Our approach leverages the schema graph to generate candidate evidence patterns, that is, schema-compliant paths that provide an alternative way of expressing a target fact template (Fig. 1 (d)). As the number of potential (schema-compliant) paths can be large, we adopt two pruning strategies. The first bounds pattern length, to avoid the inclusion of pieces of knowledge that may hinder the contextualization/understanding of the fact. The second restricts the predicates considered during the traversal to those semantically related (see Section 3.1) to the target predicate.
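The schema-graph construction just described can be sketched in a few lines of Python (a minimal, self-contained sketch with a hand-made excerpt of the schema; identifiers and data are ours for illustration, not the authors' code):

```python
# Sketch of schema-graph closure construction from domain/range assertions
# plus RDFS subclassOf propagation (toy excerpt of the DBpedia schema).
subclass_of = {"Film": "Work", "Actor": "Artist", "Artist": "Person", "Person": "Agent"}
domain = {"director": "Film", "starring": "Work"}
range_ = {"director": "Person", "starring": "Actor"}

def superclasses(c):
    """All superclasses of c (including c itself) via the subclassOf chain."""
    out = [c]
    while c in subclass_of:
        c = subclass_of[c]
        out.append(c)
    return out

def schema_graph_closure():
    """Edges (source_type, predicate, target_type), asserted or RDFS-inferred."""
    edges = set()
    for p in domain:
        for d in superclasses(domain[p]):      # new domains via subclassOf
            for r in superclasses(range_[p]):  # new ranges via subclassOf
                edges.add((d, p, r))
    return edges

edges = schema_graph_closure()
assert ("Film", "director", "Person") in edges  # asserted
assert ("Work", "director", "Person") in edges  # inferred: Film subclassOf Work
```

This reproduces the example from the text: the inferred edge (Work, director, Person) appears alongside the asserted (Film, director, Person).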
For instance, when checking a fact about director, predicates like population can be misleading. Fig. 1 (e) shows some evidence patterns of length 3 obtained when considering the top-3 predicates most related to director (i.e., director, starring, cinematography). The first pattern in Fig. 1 (e) shows an alternative way of expressing the fact template in Fig. 1 (d): a Film (the superclass Work in this case) can have an outgoing starring edge that links it to entities of type Actor in the underlying KG, which in turn have incoming starring edges from entities of type Work; these are finally linked to entities of type Person via director edges. Candidate evidence patterns are generated by traversing the schema graph from the source entity types toward the target entity types; in our example, from {Film, Work} (for Dune) to Person (for D. Lynch).

Evidence Pattern Verification. Candidate evidence patterns tell us how the subject and object entity types of a fact could be connected in the data; however, not all of them are verified (i.e., have bindings) in the data. Candidate pattern verification is performed by: (i) instantiating the pattern endpoints with the subject and object of the target statement (Fig. 1 (f)); and (ii) traversing the data graph to check whether the subject can reach the object according to the sequence of predicates in the pattern. To verify the first evidence pattern, our verification algorithm starts from Dune in the KG and, by traversing the sequence of edges starring, starring, director (according to their directions), checks whether D. Lynch can be reached. By traversing starring edges from Dune it is possible to reach entities like J. Nance; from here, by traversing starring edges backward, nodes like Twin Peaks can be reached. From this latter set of nodes, director edges are traversed forward toward D. Lynch. In this example, only the first two patterns are verified in the KG data.
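The verification step just described can be sketched as a set-at-a-time traversal over the triples (toy data mirroring the running example; helper names are ours, not the authors' code):

```python
# Toy verification of the pattern  starring(forward) . starring(backward) . director(forward)
# against a tiny hand-made KG excerpt.
triples = [
    ("Dune", "starring", "J_Nance"),
    ("Dune", "starring", "K_Mclaughlin"),
    ("Twin_Peaks", "starring", "J_Nance"),
    ("Twin_Peaks", "director", "D_Lynch"),
    ("Blue_Velvet", "starring", "K_Mclaughlin"),
    ("Blue_Velvet", "director", "D_Lynch"),
]

def step(frontier, pred, forward=True):
    """Expand a set of nodes along edges labeled pred, forward or backward."""
    out = set()
    for s, p, o in triples:
        if p != pred:
            continue
        if forward and s in frontier:
            out.add(o)
        elif not forward and o in frontier:
            out.add(s)
    return out

def verify(subject, obj, pattern):
    """pattern: list of (predicate, forward?) steps; True iff obj is reachable."""
    frontier = {subject}
    for pred, fwd in pattern:
        frontier = step(frontier, pred, fwd)
    return obj in frontier

pattern = [("starring", True), ("starring", False), ("director", True)]
assert verify("Dune", "D_Lynch", pattern)  # Dune -> actors -> their films -> D. Lynch
```

Each step keeps only the current frontier of nodes, exactly as in the text: Dune, then its actors, then the films they starred in, then those films' directors.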
Evidence (paths in the KG) from verified patterns allows us to build an evidence graph.

Figure 2: Evidence supporting the fact (Dune, director, D. Lynch).

Fig. 2 shows an excerpt of the evidence graph built from the first two patterns in Fig. 1 (e). Actors like K. Mclaughlin and J. Nance starred both in Dune and in other movies directed by D. Lynch; F. Francis was responsible for the cinematography of Dune and of The Elephant Man, directed by D. Lynch. These chains of facts support the (true) fact that D. Lynch directed Dune.

Evidence score. Evidence as shown in Fig. 2 takes some human effort to parse and understand. To fully automate the fact-checking process and numerically distill evidence, we devised the evidence score. Given a pattern, the score takes into account both how many ways the subject and the object are linked and how specific the subject and the object are. As an example, the first and the second evidence patterns link the subject and the object in 3 and 2 different ways, respectively (see Fig. 2). Moreover, the first evidence pattern is more specific than the second w.r.t. the subject; this is because if instead of D. Lynch there were an arbitrary Person, then the first and second patterns would have 488 and 26 counts, respectively. Frequency and specificity can be combined according to different similarity measures (e.g., Dice, Tversky) to provide an evidence score between 0 and 1. In our example, for the first evidence pattern in Fig. 1 (e) the evidence score is 0.22 when using Tversky's measure. A final evidence score can be assembled by averaging the scores of the individual (verified) patterns.

Reusing Patterns. Consider the fact (Kill Bill, director, Q. Tarantino).
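The text does not spell out the exact Tversky instantiation; the following is one plausible, hypothetical way to combine frequency and specificity into a score in [0, 1] (the function names and the α/β weighting are our assumptions, not the paper's definition):

```python
# Hypothetical sketch of a Tversky-style evidence score; the paper's exact
# combination of frequency and specificity may differ.
def tversky_score(n_paths, n_subject_generalized, n_object_generalized,
                  alpha=0.5, beta=0.5):
    """n_paths: number of ways the concrete subject/object pair is linked
    by the pattern. n_*_generalized: path counts when the subject (resp.
    object) is replaced by an arbitrary entity of the right type; the larger
    these counts, the less specific the pattern is for this pair."""
    subj_penalty = max(n_subject_generalized - n_paths, 0)
    obj_penalty = max(n_object_generalized - n_paths, 0)
    return n_paths / (n_paths + alpha * subj_penalty + beta * obj_penalty)

def evidence_score(pattern_scores):
    """Final score: average of the verified patterns' individual scores."""
    return sum(pattern_scores) / len(pattern_scores) if pattern_scores else 0.0

# A pattern whose match count explodes once an endpoint is generalized is
# less specific for this pair and therefore scores lower:
assert tversky_score(3, 488, 3) < tversky_score(3, 26, 3)
```

The asymmetry parameters α and β make Dice a special case (α = β = 0.5), matching the text's remark that different similarity measures can be plugged in.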
To check this fact we can reuse the candidate evidence patterns generated in the previous example, because the fact template (Fig. 1 (d)) is the same. Even in this case the first pattern is verified in the data, as S. Jackson also acted in Kill Bill 2 and Django Unchained, which were directed by Q. Tarantino. False facts like (Dune, director, A. Jodorowsky) or (The Godfather Part II, director, Michael Mann) do not verify any candidate pattern in Fig. 1 (e).

1.3 Contributions and Outline

We tackle the problem of fact checking with a novel approach that generates evidence patterns from the KG schema and uses them to assemble semantic evidence from the data. We contribute: (i) a fact-checking approach based on the idea of evidence patterns; (ii) efficient algorithms to generate and verify evidence patterns and to assemble semantic evidence; (iii) the evidence score as a final distillation of the fact-checking process; (iv) two implementations (one in-memory and one SPARQL-based), an experimental evaluation, and a comparison with related research. The remainder of the paper is organized as follows. We introduce some background in Section 2. Section 3 describes our approach. The implementation and experimental evaluation are discussed in Section 4. We conclude in Section 5.

2 Preliminaries

A Knowledge Graph (KG) is a directed node- and edge-labeled multi-graph G = (V, E, T), where V is a set of uniquely identified vertices representing entities (e.g., D. Lynch, France), E is a set of predicates (e.g., director, nationality), and T is a set of statements of the form (s, p, o) representing directed labeled edges, where s, o ∈ V and p ∈ E. To structure knowledge, KGs resort to an underlying schema. In this paper we focus on the portion of the schema that concerns entity types (and their hierarchy) and the domain and range of predicates. Our approach leverages RDFS inference rules to construct the RDFS closure of the schema [Munoz et al., 2009; Franconi et al., 2013].
From the closure of the schema we build the corresponding schema graph closure.

Definition 1 (Schema Graph Closure). Given a KG schema, its closure is defined as GS = (Vs, Es, Ts), where each vi ∈ Vs is a node denoting an entity type, each pi ∈ Es denotes a predicate, and each (vs, pi, vt) ∈ Ts is a triple (asserted or inferred via RDFS reasoning) where vs (resp., vt) is the domain (resp., range) of the predicate pi.

Fig. 1 (a) shows an excerpt of the DBpedia schema while Fig. 1 (b) shows the corresponding schema closure; here, dashed lines represent inferred triples. For instance, the triple (Work, director, Person) is inferred from the triples (Film, subclassOf, Work) and (Film, director, Person). Given a predicate p, Dom(p) (resp., Range(p)) denotes the set of nodes (i.e., entity types) in GS having an outgoing (resp., incoming) edge labeled with p. We refer to the schema graph closure simply as the schema graph.

Definition 2 (Evidence Pattern). Given a schema graph GS = (Vs, Es, Ts), a target predicate p⋆ and an integer d, an evidence pattern is a sequence Π(p⋆) = π1 ⋈ π2 ⋈ ... ⋈ πl−1 ⋈ πl, with l ≤ d, such that πi = (ti, pi, ti+1)^δi with δi ∈ {→, ←}, ti ∈ Vs for i ∈ {1, ..., l+1}, pi ∈ Es for i ∈ {1, ..., l}, t1 ∈ Dom(p⋆), tl+1 ∈ Range(p⋆), and ⋈ denotes the composition (i.e., join) operator. The operator δi ∈ {→, ←} determines the direction of an edge in the pattern.

Looking at Fig. 1 (b), an example of evidence pattern is Π(director) = π1 ⋈ π2 ⋈ π3, where π1 = (Work, starring, Actor)^→, π2 = (Actor, starring, Work)^←, π3 = (Work, director, Person)^→.

3 Fact Checking via Evidence Patterns

To verify a fact (s, p⋆, o) our approach proceeds as follows: (i) generate candidate evidence patterns (if not already available) using the schema graph and predicate relatedness; (ii) verify the evidence patterns and build an evidence graph; (iii) optionally, output an evidence score.
3.1 Predicate Relatedness

The first ingredient of our approach is a predicate relatedness measure, which allows us to isolate the portion of the schema (and thus of the underlying data) that is of most interest for a particular fact. Its goal is to rank the predicates of a graph G = (V, E, T) w.r.t. p⋆. Given a pair of predicates pi, pj ∈ E, the relatedness measure takes into account the co-occurrence of pi and pj in the set of triples T and weights it by the predicate popularity. This approach resembles the TF-ITF scheme used in information retrieval [Shiralkar et al., 2017; Pirrò, 2012]. We now introduce its building blocks.

TF(pi, pj) = log(1 + Ci,j)   (1)

where Ci,j counts the subject/object pairs appearing in triples of the form (s, pi, o) and (s, pj, o), or (o, pi, s) and (o, pj, s). ITF is defined as follows:

ITF(pj, E) = log( |E| / |{pi : Ci,j > 0}| )   (2)

Having the co-occurrence of each pair (pi, pj) weighted by the ITF allows us to build a co-occurrence matrix:

wi,j(pi, pj, E) = TF(pi, pj) × ITF(pj, E)   (3)

The relatedness between pi and pj is then computed as:

Rel(pi, pj) = Cosine(Wi, Wj)   (4)

where Wi (resp., Wj) is the row of the matrix corresponding to pi (resp., pj).

3.2 Finding Evidence Patterns

We now describe the candidate evidence pattern generation algorithm (Algorithm 1). It takes as input a schema graph, a predicate p⋆ (the predicate in the target fact), an integer d bounding the length of the patterns, and an integer k selecting the top-k predicates most related to p⋆, and it uses a priority queue to store candidate patterns ranked by their relatedness w.r.t. p⋆.
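The relatedness measure of Section 3.1 (Eqs. 1-4) can be sketched as follows (a minimal Python sketch over a toy co-occurrence table; all counts are invented for illustration):

```python
import math

def relatedness_matrix(cooccur, predicates):
    """Build the TF-ITF weighted co-occurrence matrix (Eqs. 1-3).
    cooccur[(pi, pj)] = C_ij, the subject/object co-occurrence count."""
    def C(pi, pj):  # counts are symmetric
        return cooccur.get((pi, pj), cooccur.get((pj, pi), 0))
    n = len(predicates)
    # ITF(pj) = log(|E| / |{pi : C_ij > 0}|)  -- Eq. (2)
    itf = {}
    for pj in predicates:
        df = sum(1 for pi in predicates if C(pi, pj) > 0)
        itf[pj] = math.log(n / df) if df else 0.0
    # w_ij = TF(pi, pj) * ITF(pj), with TF = log(1 + C_ij)  -- Eqs. (1), (3)
    return {pi: [math.log(1 + C(pi, pj)) * itf[pj] for pj in predicates]
            for pi in predicates}

def cosine(u, v):
    """Rel(pi, pj) = Cosine(Wi, Wj)  -- Eq. (4)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

preds = ["director", "starring", "cinematography", "population"]
cooccur = {("director", "starring"): 40, ("director", "cinematography"): 25,
           ("starring", "cinematography"): 30, ("director", "director"): 100,
           ("starring", "starring"): 90, ("cinematography", "cinematography"): 50,
           ("population", "population"): 70}
W = relatedness_matrix(cooccur, preds)
rel = lambda a, b: cosine(W[a], W[b])
assert rel("director", "starring") > rel("director", "population")
```

As intended, a predicate like population that never co-occurs with director ends up with zero relatedness, so it is pruned from the traversal.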
Algorithm 1: findEvidencePatterns
Input: A schema graph GS = (Vs, Es, Ts), a target predicate p⋆, maximum depth d, number of related predicates k
Output: Evidence patterns P

1   P = ∅   /* priority queue based on average relatedness to p⋆ */
2   R = getRelatedPredicates(p⋆, k)
3   Parallel execution:
4   for each predicate pi ∈ R do
5       Let Π1 = ∅
6       for (tr, pi, ts) ∈ Ts do
7           if tr ∈ sourceNodes(p⋆) then
8               Π1 = Π1 ∪ {(tr, pi, ts)^→}
9               if ts ∈ targetNodes(p⋆) then
10                  P = P ∪ {(tr, pi, ts)^→}
11          if ts ∈ sourceNodes(p⋆) then
12              Π1 = Π1 ∪ {(tr, pi, ts)^←}
13              if tr ∈ targetNodes(p⋆) then
14                  P = P ∪ {(tr, pi, ts)^←}
15  for j = 2, ..., d do
16      Let Πj = ∅
17      for each evidence pattern π ∈ Πj−1 do
18          for each entity type ns ∈ outNodes(π) do
19              for each triple (ns, p, nt) ∈ Ts s.t. p ∈ R do
20                  Πj = Πj ∪ {π ⋈ (ns, p, nt)^→}
21              for each triple (nt, p, ns) ∈ Ts s.t. p ∈ R do
22                  Πj = Πj ∪ {π ⋈ (ns, p, nt)^←}
23      for each evidence pattern πj ∈ Πj do
24          if checkTypes(πj, p⋆) = true then
25              P = P ∪ {πj}
26  return P

For each of the most related predicates (line 4), the algorithm performs in parallel a traversal of the schema graph, checking whether the source node tr of pi is also a source node of p⋆ (line 7) (resp., whether ts is, line 11); this builds length-1 evidence patterns, which are added to the results if they also reach a target node of p⋆: ts (line 9) (resp., tr (line 13)). Length-1 evidence patterns are then expanded (lines 15-25). The algorithm takes a q-length (partial) pattern (with q
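The generation loop of Algorithm 1 can be sketched as a breadth-first traversal of the schema graph (a simplified Python sketch: no priority queue, no parallelism, toy schema edges, and no deduplication; data and names are ours for illustration):

```python
# Simplified sketch of candidate evidence-pattern generation: BFS over the
# schema graph restricted to the top-k related predicates, keeping patterns
# whose endpoints fall in Dom(p*) and Range(p*).
schema = [  # (source type, predicate, target type) edges of the schema graph
    ("Work", "starring", "Actor"),
    ("Work", "director", "Person"),
    ("Film", "director", "Person"),
    ("Film", "cinematography", "Person"),
]
related = {"starring", "director", "cinematography"}  # assumed top-k for director

def find_patterns(src_types, tgt_types, max_len):
    """Return direction-annotated predicate sequences from src_types to tgt_types."""
    frontier = [(t, []) for t in src_types]  # (current schema node, pattern so far)
    results = []
    for _ in range(max_len):
        nxt = []
        for node, pat in frontier:
            for s, p, t in schema:
                if p not in related:
                    continue
                if s == node:      # traverse the edge forward
                    nxt.append((t, pat + [(p, "->")]))
                if t == node:      # traverse the edge backward
                    nxt.append((s, pat + [(p, "<-")]))
        frontier = nxt
        results += [pat for node, pat in frontier if node in tgt_types]
    return results

pats = find_patterns({"Film", "Work"}, {"Person"}, max_len=3)
# the running example's pattern: starring forward, starring backward, director forward
assert [("starring", "->"), ("starring", "<-"), ("director", "->")] in pats
```

The real algorithm additionally ranks patterns by average relatedness in a priority queue, expands predicates in parallel, and type-checks each expanded pattern (checkTypes) before adding it to the result.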