# lifted_inference_with_linear_order_axiom__9d09a3de.pdf Lifted Inference with Linear Order Axiom Jan T oth, Ondˇrej Kuˇzelka Faculty of Electrical Engineering Czech Technical University in Prague Prague, Czech Republic {tothjan2, ondrej.kuzelka}@fel.cvut.cz We consider the task of weighted first-order model counting (WFOMC) used for probabilistic inference in the area of statistical relational learning. Given a formula ϕ, domain size n and a pair of weight functions, what is the weighted sum of all models of ϕ over a domain of size n? It was shown that computing WFOMC of any logical sentence with at most two logical variables can be done in time polynomial in n. However, it was also shown that the task is #P1-complete once we add the third variable, which inspired the search for extensions of the two-variable fragment that would still permit a running time polynomial in n. One of such extension is the two-variable fragment with counting quantifiers. In this paper, we prove that adding a linear order axiom (which forces one of the predicates in ϕ to introduce a linear ordering of the domain elements in each model of ϕ) on top of the counting quantifiers still permits a computation time polynomial in the domain size. We present a new dynamic programming-based algorithm which can compute WFOMC with linear order in time polynomial in n, thus proving our primary claim. Introduction The task of probabilistic inference is at the core of many statistical machine learning problems and much effort has been invested into performing inference faster. One of the techniques, aimed mostly at problems from the area of statistical relational learning (Getoor and Taskar 2007), is lifted inference (Van den Broeck et al. 2021). A very popular way to perform lifted inference is to encode the particular problem as an instance of the weighted first-order model counting (WFOMC) problem. It is worth noting that applications of WFOMC range much wider, making it an interesting research subject in its own right. For instance, it was used to aid in conjecturing recursive formulas in enumerative combinatorics (Barv ınek et al. 2021). Computing WFOMC in the two-variable fragment of firstorder logic (denoted as FO2) can be done in time polynomial in the domain size, which is also referred to as FO2 being domain-liftable (Van den Broeck 2011). Unfortunately, it was also shown that the same does not hold in FO3 where the problem turns out to be #P1-complete in general (Beame et al. 2015). That has inspired a search for extensions of FO2 Copyright 2023, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. that would still be domain-liftable. Several new classes have been identified since then. Kazemi et al. (2016) introduced S2FO2 and S2RU. Kuusisto and Lutz (2018) extended the two-variable fragment with one functionality axiom and showed such language to still be domain-liftable. That result was later generalized to the two-variable fragment with counting quantifiers, denoted C2(Kuˇzelka 2021). Moreover, van Bremen and Kuˇzelka (2021b) proved that C2 extended by the tree axiom is still domain-liftable as well.1 Another extension of C2 can be obtained by adding a linear order axiom. Linear order axiom (Libkin 2004) enforces some relation in the language to introduce a linear (total) ordering on the domain elements. Such a constraint is inexpressible using only two variables, requiring special treatment. This logic fragment has also received some attention from logicians (Charatonik and Witkowski 2015). In this paper, we show that extending C2 with a linear order axiom yields another domain-liftable language. We present a new dynamic programming-based algorithm for computing WFOMC in C2 with linear order. The algorithm s running time is polynomial in the domain size meaning that C2 with linear order is domain-liftable. Even though our result is mostly of theoretical interest, we still provide some interesting applications and experiments. Among others, we perform exact inference in a Markov Logic Network (Richardson and Domingos 2006) on a random graph model similar to the one of Watts and Strogatz (Watts and Strogatz 1998).2 Background Let us now review necessary concepts, definitions and assumptions as well as notation. We use boldface letters such as k to differentiate vectors from scalar values such as n. If we do not name individual vector components such as k = (k1, k2, . . . , kd), then the i-th element of k is denoted by (k)i. Since our vectors only have non-negative entries, the sum of vector elements, i.e., 1Other recent works in lifted inference not directly related to our work presented here are works of van Bremen and Kuˇzelka (2021a), Malhotra and Serafini (2022) and Wang et al. (2022). 2This paper is accompanied by a technical report available at https://arxiv.org/abs/2211.01164 The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23) Pd i=1(k)i, always coincides with the L1-norm. Hence, we use |k| as a shorthand for the sum. We also introduce special name δj for a vector such that (δj)i = 1 if i = j, 0 otherwise. For a vector k = (k1, k2, . . . , kd) with |k| = n, |k| k = n k1, k2, . . . , kd denotes the multinomial coefficient. We make use of one non-trivial identity of multinomial coefficients (Berge 1971), namely We also assume the set of natural numbers N to contain zero and that 00 = 1. We use [n] to denote the set { 1, 2, . . . , n }. First-Order Logic We work with a function-free subset of first-order logic. The language is defined by a finite set of constants , a finite set of variables V and a finite set of predicates P. If the arity of a predicate P P is k, we also write P/k. An atom has the form P(t1, t2, . . . , tk) where P/k P and ti V. A literal is an atom or its negation. A formula is an atom and a literal. More complex formulas may be formed from existing formulas by logical connectives, or by surrounding them with a universal ( x) or an existential ( x) quantifier where x V. A variable x in a formula is called free if the formula contains no quantification over x. A formula is called a sentence if it contains no free variables. A formula is called ground if it contains no variables. As is customary in computer science, we adopt the Herbrand semantics (Hinrichs and Genesereth 2006) with a finite domain. Since we have a finite domain with a one-toone correspondence to the constant symbols, we denote the domain also with . We denote the Herbrand base by HB. We use ω to denote a possible world, i.e., any subset of HB. When we wish to restrict a possible world ω to only atoms with a particular predicate P, we write ω[P]. We work with logical sentences containing at most two variables (the language of FO2). We assume our FO2 sentences to be constant-free. Dealing with constants in lifted inference is a challenge in its own right. Treatment of conditioning on evidence as well as using constants in sentences is available in other literature (Van Den Broeck and Davis 2012; Van Haaren et al. 2016). Weighted Model Counting and Lifted Formulation Throughout this paper, we study the weighted first-order model counting. We will also make use of its propositional variant, the weighted model counting. Let us formally define both these tasks. Definition 1. (Weighted Model Counting) Let ϕ be a logical formula over some propositional language L. Let HB denote the Hebrand base of L (i.e., the set of all propositional variables). Let w : HB 7 R and w : HB 7 R be a pair of weightings assigning a positive and a negative weight to each variable in L. We define WMC(ϕ, w, w) = X Definition 2. (Weighted First-Order Model Counting) Let ϕ be a logical formula over some relational language L. Let n be the domain size. Let HB denote the Hebrand base of L over the domain = { 1, 2, . . . , n }. Let P be the set of the predicates of the language L and let pred : HB 7 P map each atom to its corresponding predicate symbol. Let w : P 7 R and w : P 7 R be a pair of weightings assigning a positive and a negative weight to each predicate in L. We define WFOMC(ϕ, n, w, w) = X l ω w(pred(l)) Y w(pred(l)). Remark 1. Since for any domain of size n, we can define a bijective mapping π such that π( ) = { 1, 2, . . . , n }, WFOMC is defined for an arbitrary domain of size n. Cells and Domain-Liftability of FO2 We will not build on the original proof of domain-liftability of FO2 (Van den Broeck 2011; Van den Broeck, Meert, and Darwiche 2014), but rather on the more recent one (Beame et al. 2015). Let us review some parts of that proof as we make use of them later in the paper. An important concept is the one of a cell. Definition 3. A cell of a first-order formula ϕ is a maximally consistent conjunction of literals formed from atoms in ϕ using only a single variable. We will denote cells as C1(x), C2(x), . . . , Cp(x) and assume that they are ordered (indexed). Note, however, that the ordering is purely arbitrary. Example 1. Consider ϕ = Sm(x) Fr(x, y) Sm(y). Then there are four cells: C1(x) = Sm(x) Fr(x, x), C2(x) = Sm(x) Fr(x, x), C3(x) = Sm(x) Fr(x, x), C4(x) = Sm(x) Fr(x, x). It turns out, that if we fix a particular assignment of domain elements to the cells and if we then condition on such evidence, the WFOMC computation decomposes into mutually independent and symmetric parts, simplifying the computation significantly. When we say assignment of domain elements to cells, we mean a domain partitioning allowing empty partitions, that is ordered with respect to a chosen cell ordering. Each partition Sj then holds the constants assigned to the cell Cj. Such partitioning can be captured by a vector. We call such a vector a partitioning vector and often shorten the term to a p-vector. Definition 4. Let C1, C2, . . . , Cp be cells of some logical formula. Let n be the number of elements in a domain. A partitioning vector (or a p-vector) of order n is any vector k Np such that |k| = n. Moreover, conditioning on some cells may immediately lead to an unsatisfiable formula. To avoid unnecessary computation with such cells, we only work with valid cells (van Bremen and Kuˇzelka 2021a). Definition 5. A valid cell of a first-order formula ϕ(x, y) is a cell of ϕ(x, y) and is also a model of ϕ(x, x). Example 2. Consider ϕ = F(x, y) (G(x) H(x)). Cells setting both G(x) and H(x) to false are not valid cells of ϕ. Let us now introduce some notation for conditioning on particular (valid) cells. Denote ψij(x, y) = ψ(x, y) ψ(y, x) Ci(x) Cj(y), ψk(x) = ψ(x, x) Ck(x), and define rij = WMC(ψij(A, B), w , w ), (1) wk = WMC(ψk(A), w, w), (2) where A, B and the weights w , w are the same as w, w except for the atoms appearing in the cells conditioned on. Those weights are set to one, since the weights of the unary and binary reflexive atoms are already accounted for in the wk terms. Finally, we can write WFOMC(ϕ, n, w, w) = X i,j [p]:i