# measuring_the_likelihood_of_numerical_constraints__70494cfe.pdf

Measuring the Likelihood of Numerical Constraints

Marco Console , Matthias Hofer and Leonid Libkin School of Informatics, University of Edinburgh {mconsole, mhofer, libkin}@inf.ed.ac.uk

Our goal is to measure the likelihood of the satisfaction of numerical constraints in the absence of prior information. We study expressive constraints, involving arithmetic and complex numerical functions, and even quantiﬁcation over numbers. Such problems arise in processing incomplete data, or analyzing conditions in programs without a priori bounds on variables. We show that for constraints on n variables, the proper way to deﬁne such a measure is as the limit of the part of the n-dimensional ball that consists of points satisfying the constraints, when the radius increases. We prove that the existence of such a limit is closely related to the notion of o-minimality from model theory. Thus, for constraints deﬁnable with the usual arithmetic and exponentiation, the likelihood is well deﬁned, but adding trigonometric functions is problematic. We look at computing and approximating such likelihoods for order and linear constraints, and prove an impossibility result for approximating with multiplicative error. However, as the likelihood is a number between 0 and 1, an approximation scheme with additive error is acceptable, and we give it for arbitrary linear constraints.

1 Introduction Our goal is to measure the likelihood of numerical constraints without having any additional knowledge such as a probability distribution. A constraint over n variables can be seen as a map ϕ( x) : Rn {0, 1} or a subset ϕ(Rn) = { a Rn | ϕ( a) = 1} of tuples in Rn satisfying the constraint. To express them, we can use standard arithmetic operations +, , , , and even more complex ones such as ex, ln x. For example, x2 < y < ex is a constraint, and we can even extend the language with quantiﬁers and write constraints such as z (ez x = z y2). For a constraint ϕ( x), we would like to measure how likely it is to be satisﬁed. If we are only interested in a bounded subset of Rn, we can just compute the volume of the intersection of ϕ(Rn) with that subset. Without explicit bounds, if we have a probability distribution p( x) on Rn, this likelihood is just the expected value of ϕ, i.e., R

Rn ϕ( x)p( x)d x.

But what happens when we have neither explicit bounds nor a probability distribution? Since there is no uniform distribution over the entire Rn, we would like to, instead, estimate the portion that ϕ(Rn) cuts from Rn, and view it as the likelihood of ϕ. But while it is intuitively clear that a tautological constraint like 1 > 0 is likelier than x 1 which in turn is likelier than x = 0, how do we formally explain this? The motivation for these questions comes from several different areas. One is program analysis, whose relationship with constraint solving is discussed in [Gulwani et al., 2008]. Likelihoods of a branch being taken, and more generally, of an execution path of a program, have been associated with volumes of certain sets in Rn ([Ma et al., 2009]). This correspondence only works when there are explicit bounds on input variables (and thus volumes are ﬁnite). However, ad hoc bounds on variables can affect likelihoods drastically. To reason about execution paths without such bounds, we need to estimate likelihoods of arbitrary constraints over Rn. Another motivating example comes from running queries on incomplete data, i.e., data with explicit markers saying that some values are unknown (e.g., nulls in databases). If we want to estimate the likelihood of answers to queries over such data, we need to estimate the likelihood of conditions speciﬁed in queries. This is useful for estimating how close an answer is to certainty, as was shown in [Libkin, 2018]. That work, however, only considered equality constraints, while database queries often use more complex conditions. How to estimate the likelihood of these conditions is yet to be deﬁned. The third motivation comes from temporal constraint satisfaction ([Dechter et al., 1991]). If we are not interested in mere satisﬁability, but we want to estimate how likely the events described by a temporal constraint network are, we need to estimate the proportion of Rn that its constraints deﬁne. As a ﬁnal example, a continuous analog of weighted model counting for propositional formulae ([Chavira and Darwiche, 2008]) was proposed in [Belle et al., 2015]. The idea behind it is to compute volumes of sets of continuous parameters of models; but these volumes must be ﬁnite and hence parameters bounded. To lift this restriction, one would need to replace such volumes by measures of likelihoods of unbounded sets.

Thus, the question we are addressing is not only mathematically natural, but also relevant in several different scenarios. We now look at a few examples to indicate that the question is not trivial. Start with a simple constraint x y over R2. It

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

Figure 1: Illustration to the examples in the introduction

seems quite obvious that if we know nothing about x and y, its likelihood is 1/2. But what about x 2y? On a quick reﬂection one realizes that nothing changes: for x y we had symmetry along the y = x line, and now we have symmetry along the y = x/2 line, so the likelihood is still 1/2. But now change things a bit and consider ϕ(x, y) = (x 2y) (x, y 0). If we consider an explicit bound r x, y r, for some r > 0, then this constraint cuts 1 16-th of the square [ r, r]2, see Fig. 1 (a). More precisely, λ(ϕ(R2) [ r, r]2)/λ([ r, r]2) = 1 16, where λ( ) refers to the Lebesgue measure (area in 2d, volume in 3d). This looks like a sensible way of deﬁning the proportion we are interested in, until we apply a simple isometry transformation which should not change the value. In this case, rotate ϕ(R2) anticlockwise β degrees where β = π/4 arctan(1/2)/2. For the new set Cβ, shown in Fig. 1 (b), we have λ(Cβ [ r, r]2)/λ([ r, r]2) = (1 tan(β))/4 = 1/16. So, considering identical bounds [ r, r] along both axes does not work well. What if we use the radius r ball Bn r instead? This looks like a better idea: in this case, λ(ϕ(R2) B2 r)/λ(B2 r) = arctan(1/2)/2π, see Fig. 1 (c). In fact every rotation or an isometry in general, applied to ϕ(R2) (for instance, one resulting in Cβ) will give us the same result. Thus, using balls to measure the likelihood leads to more intuitive results. However, in the previous example we were lucky as the ratio did not depend on the radius r. This is not always so: consider a slight modiﬁcation of ϕ, namely ϕ (x, y) = (x 2y) (x, y 1), see Fig. 1 (d). Then λ(ϕ (R2) B2 r)/λ(B2 r) = arctan(1/2)/2π 1/πr2. How do we get rid of the dependence on the radius? The idea, similarly to what is done in the study of asymptotic behavior of properties deﬁned by logical sentences ([Spencer, 2010]) is to consider the limit as r increases. In this case limr λ(ϕ (R2) B2 r)/λ(B2 r) = arctan(1/2)/2π. Intuitively, the change from x, y 0 to x, y 1 affects a very small portion of the set deﬁned by ϕ , and can be disregarded. This is the proper way to deﬁne the likelihood that we shall follow. However, there is a caveat: do we know that the limit always exists? In fact it does not: consider a new constraint ϕ (x) saying that 22k |x| 22k+1 for some k N. In this case, as we shall formally prove later, the limit does not exist, and there seems to be no good way of deﬁning the likelihood (in fact the likelihood will be oscillating in the [ 1

3 ε] interval, with decreasing ε as k increases).

These simple examples show that the question of deﬁning the likelihood of numerical constraints is far from trivial and several questions need to be answered.

How do we deﬁne it properly? The examples seem to indicate that the ball behaves better than the cube. Note that those were deﬁning sets a 2 r and a r for the Euclidean norm 2 and the inﬁnity norm . But there are many other norms one could consider.

How do we know that the limit exists and the likelihood of a constraint is well deﬁned?

If the limit does exist and the likelihood is well deﬁned, can we compute it? If it is not a rational number, like arctan(1/2)/2π in the example, can we approximate it, and how efﬁciently can we do so?

We answer these questions and establish the following results. First, we formally deﬁne the likelihood µ(ϕ) for numerical constraints ϕ and show that this can only be properly done when one considers the Euclidean norm 2. We then address the question of existence of the limit. We prove that it exists for arbitrary constraints deﬁnable in the ﬁrst-order theory of the structure R, +, , ex . In particular such constraints can use +, , , , ex, ln x. This relies on deep results in logic, related to the notion of o-minimal structures ([Van den Dries, 1998]). O-minimality states that subsets of R deﬁnable in ﬁrst-order logic are ﬁnite unions of intervals. This condition is violated by adding, for instance, a predicate N( ) for natural numbers, or the sine function. We prove that adding either a test for natural numbers, or any trigonometric function leads to a situation when the limit does not exist and the likelihood cannot be deﬁned. Finally, we look at computing and approximating such likelihoods. Even in the simplest case of order constraints, ﬁnding the exact likelihood is intractable. For linear constraints, the exact likelihoods, even for simplest constraints, are usually irrational numbers, and thus need to be approximated. We consider approximations of two standard kinds: with multiplicative error, and with additive error, see [Motwani and Raghavan, 1995]. We prove that the former does not exist under widely held complexity-theoretic assumptions. However, as we are approximating numbers between 0 and 1, approximation schemes with additive error are acceptable. We present such a schema for arbitrary linear constraints. Organization. Basic notations are given in Section 2. The deﬁnition of the likelihood of constraints is given in Section 3.

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

Its existence is studied in Section 4. Complexity of computing and approximating likelihood is investigated in Section 5 for order constraints, and in Section 6 for linear constraints. Conclusions are in Section 7.

2 Preliminaries

We use N, Z, Q, and R to denote natural, integer, rational, and real numbers, and superscript n to denote n-tuples of those, also referred to as (n-dimensional) points. For each p 1, the p-norm a p of a Rn is deﬁned as (Pn i=1 |ai|p)1/p. Furthermore, a = limp a p = max{|a1|, . . . , |an|}. If p is omitted, we implicitly assume the Euclidean norm, i.e., p = 2 and a = a 2. For c Rn and r R, the n-dimensional p-ball of radius r and center c is Bn p,r( c) = { a Rn | a c p r}. If p and c are omitted, we assume p = 2 and c = 0, i.e., Bn r = Bn 2,r( 0). If S Rn, we denote by λn(S) the n-dimensional Lebesgue measure of S. When n is clear from the context, we write λ(S) for the n-dimensional Lebesgue measure of S. The Lebesgue measure is the unique extension of the standard notion of length, area and volume, in 1, 2 and 3 dimensions, to the n-dimensional case.

Constraint Languages We consider ﬁrst-order languages over structures on numbers, mainly the reals. Let Ωbe the vocabulary containing function and predicate symbols. Function symbols will be standard numerical functions such as +, , ex, ln x, sin x, etc, and, likewise, predicates are the standard predicates, such as =, <, etc. Constants are viewed as functions of arity zero. Firstorder formulae in the language of Ωare deﬁned in the standard way. Assume a countably inﬁnite set VAR of symbols, called set of variables. Each variable and each constant is a term. Then, if t1, . . . , tn are terms, and f is a function symbol of arity n, f(t1, . . . , tn) is a term, too. If P is a k-ary predicate symbol from Ω, and t1, . . . , tk are terms, P(t1, . . . , tk) is an atomic formulae. Furthermore, if ϕ and ψ are formulae, ϕ ψ, ϕ ψ, ϕ are formulae, as are x ϕ and x ϕ. The deﬁnitions of bound and free variables are standard. We write t( x) and ϕ( x) if the (distinct) free variables of t and ϕ are x, and say that ϕ is of arity n (or n-ary) if it contains exactly n distinct free variables. Semantics for terms and formulae in FO(Ω) is deﬁned with respect to a structure M = R, Ω and an assignment a Rn for free variables x VARn. For the value of a term, we write t( a). For example, given a term t(x, y) = ex + y, the value t(0, 1) is 2. We say that ϕ( a) is true in M when it is satisﬁed under the standard ﬁrst-order semantics. For example, ϕ(u, v, w) = x (u x x+v x+w = 0) is formula in three free variables; ϕ(a, b, c) is true iff b2 4ac 0. For a formula ϕ( x) with x VARn, let Ωϕ contain all the function and predicate symbols used in ϕ. Deﬁne

JϕK = { a Rn | ϕ( a) is true in R, Ωϕ } .

Since we assume the standard semantics of numerical functions and predicates, the above deﬁnition does not depend on the exact vocabulary Ωof a structure as long as Ωϕ Ω. Indeed, then ϕ( a) is true in R, Ω iff it is true in R, Ωϕ .

3 Deﬁning the Likelihood

To deﬁne the likelihood of a constraint given by a formula ϕ( x) over R, Ω , we need, as explained in the introduction, to estimate the portion of Rn occupied by JϕK. We ﬁrst look at the well-deﬁned portion of Bn p,r = { a Rn | a p r} occupied by JϕK in the radius r ball (with respect to the p norm), namely λ(JϕK Bn p,r)/λ(Bn p,r), and then consider its asymptotic behavior as the bound r increases. This gives us the following deﬁnition.

Deﬁnition 1. Let ϕ( x) be a formula with n-free variables over the vocabulary Ω. The p-likelihood of ϕ is deﬁned as:

µp(ϕ) = lim r λ(JϕK Bn p,r) λ(Bnp,r)

To answer the question which value p to choose, note that we do not expect the measure to change when simple volumepreserving transformations, such as rotations or reﬂections, are applied to JϕK. These in general fall in the class of isometries, i.e., linear transformations given by orthogonal matrices M that satisfy M T = M 1 (i.e. the transpose of M equals its inverse). Note that, as long as Ωcontains + and , the set MJϕK is deﬁnable in the structure R, Ω , i.e., there is a formula Mϕ such that JMϕK = MJϕK. We next show that the only norm, for which applying isometries does not change the likelihood µp, is the Euclidean norm. More precisely, we say that the likelihood µp is well-behaved if, for every structure R, +, , . . . , every formula ϕ( x), and every orthogonal matrix M, we have µp(ϕ) = µp(Mϕ). Since µp is deﬁned as the limit, the equality means that, if one side of the equation exists, then so is the other, and they are both equal.

Theorem 1. The likelihood µp is well-behaved iff p = 2, i.e., only for the Euclidean norm.

Proof idea. For all orthogonal matrices M and measurable sets Z Rn, we have λ(Z) = λ(MZ). However MBn p,r = Bn p,r does not hold for all orthogonal matrices M unless p = 2. Hence, in general, also M(S Bn p,r) = MS Bn p,r, for p = 2. Therefore, λ(S Bn p,r) = λ(M(S Bn p,r)) = λ(MS Bn p,r) for all orthogonal matrices M, unless p = 2.

In view of this, we now concentrate on µ2(ϕ), and simply write µ(ϕ), omitting the subscript 2.

4 Existence of the Likelihood

Due to the way the likelihood is deﬁned, a very natural question to ask is whether µ(ϕ) exists for all constraints ϕ deﬁnable in R, Ω . We now show that the existence of the limit in Deﬁnition 1 is closely related to the subsets of R deﬁnable in R, Ω . In fact, the existence of these limits is closely related to the property of the structure known as o-minimality, cf. [Van den Dries, 1998]. We use this to prove the existence results for very expressive constraints that can use functions such as +, , , , ex, ln x and others. We then show that the most common ways of violating o-minimality, such as adding a test for natural numbers of trigonometric functions, lead to situations where the limit does not exist.

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

Existence Results The property of o-minimality states that for every formula in one free variable ϕ(x), the set ϕ(R) R is a ﬁnite union of intervals. This is easily seen to be true for the real ﬁeld R, +, , 0, 1, < . By quantiﬁer-elimination (cf. [Caviness and Johnson, 1998]), every formula ϕ(x) is equivalent to a Boolean combination of statements p1(x) > 0, . . . , pm(x) > 0, where all the pis are polynomials. Polynomials that are not identically zero have ﬁnitely many roots. Let r1 < r2 < . . . < rk be all the roots of polynomials p1(x), . . . , pm(x); then in each interval (ri, ri+1), the truth of ϕ cannot change. We now consider an expansion of this structure, the exponential ﬁeld R, +, , ex . Let us brieﬂy list what is deﬁnable in FO(<, +, , ex), to understand the power of these constraints. Constants 0 and 1 are deﬁnable as units of + and ; for example ζ1(x) = y (y x = y) deﬁnes 1. Every natural number is deﬁnable by a formula ζℓ(x) = (x = 1 + . . . + 1), where the summation is taken ℓtimes, and every rational number n/m is deﬁnable too by ζn/m(x) = y z (ζn(y) ζm(z) x z = y). Inverse operations , , ln are deﬁnable (e.g., x y = z iff x = y + z) and for every rational s, the function sx is deﬁnable since sx = eln s x and both ln and arbitrary rational numbers are deﬁnable. Deﬁnability of a partial function means that its graph is deﬁnable by an FO formula. A remarkable result by [Wilkie, 1996] showed that R, +, , ex is o-minimal. Thus, the set N is not deﬁnable in it, nor are trigonometric functions sin and cos, as with them one can deﬁne sets that are not ﬁnite unions of intervals.

Theorem 2. For every formula ϕ FO(<, +, , ex), the likelihood µ(ϕ) exists.

Proof sketch. Given a formula ϕ( x) where x = (x1, . . . , xn), consider a new formula ψ(y, x) = ϕ( x) x y (the latter condition is deﬁnable with + and ). Then ψ(r, a) holds iff a JϕK Bn r . For a ﬁxed ε > 0, a formula βε(y, z) is an ε-volume approximation for ψ if for each r, there is v such that βε(r, v) holds, and moreover, whenever βε(r, v) holds, it is the case that |v λ(ψ(r, Rn))| < ε, where ψ(r, Rn) is the set of all a Rn such that ψ(r, a) holds. Using ﬁniteness of VC dimension of deﬁnable families in o-minimal structures [Van den Dries, 1998], it was shown in [Karpinski and Macintyre, 1997; Koiran, 1995] that εapproximations of volume exists for every ε. (In fact the results there even show that whenever |v λ(ψ(r, Rn))| < ε/4, then βε(r, v) holds). Next we use the fact that o-minimal functions admit deﬁnable Skolemization, i.e., there is a deﬁnable function (a function with the deﬁnable graph) fε : R R such that βε(r, fε(r)) holds for every r. Recall that λ(Bn r ) = cn rn where cn = πn/2/Γ(n/2+1). Consider the function Fε(r) = fε(r)/rn, still deﬁnable. Then |Fε(r) cn λ(ψ(r, Rn))/λ(Bn r )| ε/rn . Since λ(ψ(r, Rn)) < λ(Bn r ), it tells us that Fε is bounded when r . From o-minimality, we know that for Fε, as a deﬁnable function, there exists r0 R such that Fε is continuous and either monotonically increasing, or decreasing, for all r > r0. Thus, limr Fε(r) exists, and therefore

by the above, limr c λ(ψ(r, Rn))/λ(Bn r ) exists. This means that µ(ϕ) = limr λ(ψ(r, Rn))/λ(Bn r ) exists.

The crucial ingredients of the proof are o-minimality and deﬁnability of volume approximations. The latter works in general for o-minimal expansions of the real ﬁeld [Karpinski and Macintyre, 1997], of which R, +, , ex is the most prominent example. Theorem 2 merely establishes the existence of the limit, but does not give a good approach to calculating it yet. Indeed, to approximate the limit, one would need to explicitly construct volume approximating formulae, and check their satisﬁability. There are two obstacles to this approach in general. First, in the case of FO(<, +, , ex) formulae, their decidability is not yet known: it was proved [Macintyre and Wilkie, 1996] under the assumption that Schanuel s conjecture a major unsolved problem in number theory holds. For the case when we have decidability of the real ﬁeld R, +, , < , the bounds for ε-volume approximations can be huge: [Benedikt and Libkin, 2002] showed how to construct approximating formulae for very simple geometric shapes and ε = 0.1 that would involve over 1010 quantiﬁers. We shall propose a much better behaved approximation scheme, but only for the linear case, without multiplication.

Non-existence Results

O-minimality is the essential condition for proving the existence result. What happens without o-minimality? Of course there are many ways to produce structures that are not ominimal, but it is known that any expansion of the real ﬁeld with nice functions, that is not o-minimal, deﬁnes either the set of natural numbers or trigonometric functions. Speciﬁcally, the following was shown in [Miller, 2011]. If we look at the expansion of R, +, with functions f : R Rn that are solutions to f (x) = Mf(x) for some n n matrix M, and the resulting expansion is not o-minimal, then one of the following is deﬁnable in this structure: either the set N, or the logarithmic spiral {(ex cos(ax), ex sin(ax)) | x R}. If the latter is added to FO(<, +, , ex), then both sin and cos are deﬁnable. In view of this, we consider non-o-minimal structures with a predicate for natural numbers, or a trigonometric function, and show that in them, the likelihood µ(ϕ) may not exist.

Theorem 3. Let N( ) be the unary predicate for natural numbers, and f any trigonometric function (sin, cos, tan, cot). Then in both FO(+, , N) and FO(+, , ex, f) one can ﬁnd formulae ϕ such that µ(ϕ) does not exist.

Proof sketch. If unary predicate N is available, with + and one can deﬁne every computable predicate over N. Hence one can deﬁne the following constraint: ϕ(x) = k (22k |x|) (|x| 22k+1) N(k) . Note that in one dimension, B1 r = [ r, r] and λ(B1 r) = 2r. Then one can show that λ(JϕK B1 22k+i)/λ(B1 22k+i) = (1 + i)/3 + o(k 1) for i = 0, 1, proving that limr λ(JϕK B1 r)/λ(B1 r) does not exist. If we have any trigonometric function among cos, tan, cot, then in FO(+, ) we can deﬁne sin(x). Next consider

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

ϕ(x, y) = (y > 0) (x > 0) (y x sin(ln(x)). Then

λ(JϕK B2 r) λ(B2r) = 1

r) if r = e(2m+1)π, 1 5πe2π + o( 1

r) if r = e(2m+2)π,

for m N. Thus limr λ(JϕK B2 r) λ(B2 r) does not exist.

5 Order Constraints

We have seen that the likelihood µ(ϕ) exists for all the formulae of FO(<, +, , ex), although the existence proof does not lend itself to a natural algorithm for ﬁnding µ(ϕ). We thus now look at concrete algorithms for simpler constraints, starting with the basic case of FO(<). Our results indicate hardness of the exact computation, thus motivating the need to look for approximations. To start with, we make the following easy observation.

Proposition 1. If ϕ FO(<), then µ(ϕ) is a computable rational number.

Proof sketch. Consider ϕ(x1, . . . , xn). For each permutation π on {1, . . . , n}, let χπ( x) be V i n 1(xπ(i) < xπ(i+1)). We can check, using decidability of R, < , whether χπ implies ϕ. Then µ(ϕ) is the number of all π satisfying this condition, divided by n! (since orderings of variables in which two are equal do not contribute to the n-dimensional measure).

We now address the exact complexity of computing µ(ϕ). Since it is a number, we must deal with counting complexity classes. Recall that #P is class of function problems that count the number of solutions of an NP problem, cf. [Arora and Barak, 2009]. Unless P = NP, problems which are hard for this class are intractable. Using this, we obtain the following lower bound for computing µ(ϕ) over FO(<).

Theorem 4. Computing µ(ϕ) for ϕ FO(<) is #P-hard, even for quantiﬁer-free formulae in disjunctive normal form.

Proof sketch. We reduce from the #P-hard problem #DNF: for a propositional formula α in DNF over variables VAR(α), count the number #α of truth assignments that satisfy α. We construct an FO formula ϕα such that µ(ϕα) = f(#α), and both ϕα and function f are polytime-computable. For this sketch, we assume that constant symbols are allowed in ϕα. We assign to each v VAR(α) a distinct (numerical) variable x VAR. To deﬁne ϕα, for every clause (l1, . . . , lm) of α, assume that the variable of each literal li is vi and that xi is the (numerical) variable assigned to vi. For each (l1, . . . , lm) of α, the formula ϕα contains a disjunct (a1 am) where each ai is equal to (xi > 0) if li is a positive literal and ai is equal to (xi < 0) if li is a negative literal. Note that ϕα is a DNF formula with free variables x1, . . . , xn. One can show that µ(ϕα) equals the number of orthants where ϕα is satisﬁed, divided by 2n. The claim follows since µ(ϕα) 2n can be computed in polytime with respect to the size of α.

The decision version of the problem is intractable too.

Proposition 2. Deciding whether µ(ϕ) > 0 is NP-complete, and deciding whether µ(ϕ) = 1 is CONP-complete for quantiﬁer-free formulae ϕ FO(<).

Proof sketch. The construction in the previous proof shows that a DNF formula α is a tautology iff µ(ϕα) = 1, proving hardness. For membership, notice that for ϕ FO(<), µ(ϕ) > 0 (or equals one) iff some (every) strict linear ordering of the free variables of ϕ satisﬁes ϕ. This linear ordering can then be guessed and checked for satisfying ϕ.

6 Linear Constraints

We have seen that computing µ(ϕ) is hard for order constraints, which suggests that, as is common in such situations, one approximates it instead. When we move to more expressive linear constraints, i.e., FO(<, +), there is another compelling reason for looking for approximations: numbers µ(ϕ) may not be rational. In fact, for proving #P-hardness of computing µ(ϕ) we needed a large number of variables, but in the linear case, irrational values appear already with very simple constraints over two variables.

Proposition 3. Let ϕa(x, y) be (y ax) (x 0), where a Q. Then µ(ϕa) is irrational unless a {0, 1}.

Proof sketch. We have µ(ϕa) = arctan(a)/2π + 1/4. Assume arctan(a) = mπ

n for m, n N, i.e., µ(ϕa) Q. Using Euler s formula we compute cos( mπ

n ) i sin( mπ

n ) 2n = e 2πmi = 1. Therefore cos( mπ

n ) i sin( mπ

n ) are algebraic integers, i.e., roots of a polynomial with integer coefﬁcients and leading coefﬁcient 1. Using closure of algebraic integers under + and we see that 4 cos2( mπ

n ) is an algebraic integer, and by trigonometry it equals 4/(a2 + 1). Since a Q and algebraic integers in Q are integers [Marcus, 1977], it means 4/(a2 + 1) Z, which implies a = 0 or a = 1.

For approximations, the standard notions of approximations are fully-polynomial randomized approximation schemes, FPRAS [Motwani and Raghavan, 1995]. For a problem of computing a function f, it is a randomized algorithm that takes an input I of f and ε > 0, and, in time polynomial in the size of I and 1

ε, produces a random variable A(I, ε) satisfying:

P |A(I, ε) f(I)| ε f(I) 3

4 (multiplicative error, in which case we refer to an FPRAS for the problem); or

P |A(I, ε) f(I)| ε 3

4 (additive error, in which case we refer to an AFPRAS for the problem).

The constant 3/4 can be replaced by any number in (1/2, 1). From the results presented in Proposition 2, we get, under the widely held complexity-theoretic assumption that RP is different from NP:

Corollary 1. There is no FPRAS for computing µ(ϕ) even for order constraints from FO(<), assuming RP = NP.

In general, FPRAS is viewed as a preferred type of approximation scheme, since there are no a priori bounds on the values f(I). But in our case µ(ϕ) [0, 1], and in such cases AFPRAS provide a good notion of approximation. We now show that such approximations exist for FO(<, +). Note that every formula over FO(<, +, , 0, 1) can be effectively transformed, using Fourier-Motzkin elimination, into a quantiﬁerfree formula; such a transformation is polynomial in the for-

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

mula and exponential in the number of variables (assuming a ﬁxed quantiﬁer alternation depth) [Weispfenning, 1988]. We thus assume that our linear constraints are quantiﬁerfree formulae in disjunctive normal form with rational coefﬁcients. This subclass of FO(<, +) will be denoted by DNF(<, +). Its formulae express unions of systems of linear equalities an inequalities. For the input representation, we assume that a formula is represented as the set of its disjuncts. In turn, each system of linear inequalities will be represented as a pair of a matrix and a vector in the usual way. Note that computing µ(ϕ) remains #P-hard for formulae DNF(<, +), as follows from Theorem 4. Thus, the assumption does not simplify the task of ﬁnding an approximation scheme but instead allows us to focus on the intrinsic source of complexity of the problem. Our main result then states:

Theorem 5. There is an AFPRAS for computing µ(ϕ) for ϕ DNF(<, +).

In the rest of the section we explain how our AFPRAS works. We ﬁrst need to set the computational model for the algorithm. For problems of a geometric nature, it is standard to use the RAM model of computation. In it, all the basic arithmetic operations are performed in constant time. As it is customary, we will also assume that the machine can uniformly at random select a real number in the range [0, 1] in constant time (see, e.g., [Bringmann and Friedrich, 2010]). Intuitively, to compute the likelihood of a formula ϕ DNF(<, +) we need to examine the asymptotic behavior of its disjuncts. A system of linear inequalities in n variables (i.e., an n-dimensional polyhedron) can be represented as the Minkowski sum of two sets: the convex hull conv(E) of a ﬁnite set E Rn, and the conic hull cone(C) of a ﬁnite set C Rn. This is due to the Minkowski-Weyl Theorem ([Solodovnikov, 1980]). Of these two sets, the only one that matters for the likelihood of ϕ DNF(<, +) is cone(C). This is due to the fact that, at the limit, the ﬁnite diameter of conv(E) will be eventually marginalized. Using this observation, we can prove the following statement. In what follows, ϕ denotes the formula obtained from ϕ DNF(<, +) by replacing each disjunct ϕi (A x b) with its homogeneous version ϕi (A x 0).

Lemma 1. For ϕ DNF(<, +) we have µ(ϕ) = µ( ϕ).

Proof idea. To obtain ϕi from ϕi, we shift each individual hyperplane of ϕi. These shifts preserve the slope of the hyperplanes, and thus the angles deﬁning cone(C).

In light of Lemma 1, to compute the likelihood of ϕ DNF(<, +) we focus directly on ϕ. This observation makes the problem much easier, as the following lemma proves.

Lemma 2. For an n-ary formula ϕ DNF(<, +), we have µ(ϕ) = λ(J ϕK Bn 1 )/λ(Bn 1 ).

Proof idea. For an homogeneous system, cone(C) is centered in the origin. The ratio λ(J ϕK Bn r )/λ(Bn r ), then, does not depend on the value of r.

Using Lemma 2, to estimate the value of µ(ϕ) we can proceed as follows. Given any ε > 0, our goal is to produce a value A(ϕ, ε) such that P(|A(ϕ, ε) µ(ϕ)| ε) 3

Algorithm 1 apx-mes

Input: n-ary ϕ DNF(<, +) and ε > 0 Output: ξ R such that P(|ξ µ(ϕ))| ε) 3/4

1: count := 0; s = ε 2 ; 2: for i = 1, ... , s do 3: sample a point from Bn 1 . 4: if ϕ( p) is true then count := count + 1; 5: end for 6: return count

this end, we model A(ϕ, ε) as a (scaled) binomial random variable As as follows. Let (Xi)1 i s be a sequence of independent and identically distributed Bernoulli random variables over the measure space (Bn 1 , F(Bn 1 )), where F(Bn 1 ) is the standard Lebesgue σ-algebra over Bn 1 . More precisely, we require that Xi( v) = 1 if and only if ϕ( v) is true. Deﬁning As = (Ps i=1 Xi)/s, we can prove the following. Lemma 3. Let ϕ DNF(<, +) and ε (0, 1]. Then, for s ε 2 the following inequality holds:

P(|As µ(ϕ))| ε) 3/4

Using Lemma 3, we can obtain an AFPRAS for µ(ϕ) as follows. First, observe that under the assumption of the RAM model we can generate uniform samples from the ndimensional unit ball in time polynomial w.r.t. n (see, e.g., [Blum et al., 2016]). To estimate the value of µ within an additive error ε then, we can sample s = ε 2 such points and compute the ratio of these points falling inside JϕK. Algorithm 1 formalizes this intuition. Lemma 4. Algorithm 1 runs in time polynomial in the size of ϕ and 1/ε. Lemma 3 and Lemma 4 together prove that Algorithm 1 is an AFPRAS for the computation of the likelihood of formulae in DNF(<, +), thus concluding the proof of Theorem 5.

7 Conclusions We provided a framework for estimating likelihoods of numerical constraints deﬁning unbounded sets in Rn, showed when such estimates exist, and explained how to approximate them for linear constraints. There are several extensions of the framework we would like to pursue. One is the extension of the approximation scheme to polynomial constraints. Another direction is related to application of these estimate in different areas, such as program analysis, querying incomplete data, and temporal constraint networks, outlined in the introduction. Finally we would like to consider constraints over Zn. We expect that our results will help estimate the likelihood of constraints over integers, since for many bodies (e.g., balls), the number of integer lattice points can serve as an approximation of the volume.

Acknowledgments Work partly supported by EPSRC grants M025268 and N023056.

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

References [Arora and Barak, 2009] Sanjeev Arora and Boaz Barak. Computational Complexity - A Modern Approach. Cambridge University Press, 2009. [Belle et al., 2015] Vaishak Belle, Andrea Passerini, and Guy Van den Broeck. Probabilistic inference in hybrid domains by weighted model integration. In IJCAI, pages 2770 2776, 2015. [Benedikt and Libkin, 2002] Michael Benedikt and Leonid Libkin. Aggregate operators in constraint query languages. J. Comput. Syst. Sci., 64(3):628 654, 2002. [Blum et al., 2016] Avrim Blum, John Hopcroft, and Ravindran Kannan. Foundations of data science. Vorabversion eines Lehrbuchs, 2016. [Bringmann and Friedrich, 2010] Karl Bringmann and Tobias Friedrich. Approximating the volume of unions and intersections of high-dimensional geometric objects. Comput. Geom., 43(6-7):601 610, 2010. [Caviness and Johnson, 1998] Bob Caviness and Jeremy Johnson. Quantiﬁer Elimination and Cylindrical Algebraic Decomposition. Springer, 1998. [Chavira and Darwiche, 2008] Mark Chavira and Adnan Darwiche. On probabilistic inference by weighted model counting. Artif. Intell., 172(6-7):772 799, 2008. [Dechter et al., 1991] Rina Dechter, Itay Meiri, and Judea Pearl. Temporal constraint networks. Artif. Intell., 49(13):61 95, 1991. [Gulwani et al., 2008] Sumit Gulwani, Saurabh Srivastava, and Ramarathnam Venkatesan. Program analysis as constraint solving. In PLDI, pages 281 292. ACM, 2008. [Karpinski and Macintyre, 1997] Marek Karpinski and Angus Macintyre. Approximating the volume of general Pfafﬁan bodies. In Structures in Logic and Computer Science, A Selection of Essays in Honor of Andrzej Ehrenfeucht, pages 162 173, 1997. [Koiran, 1995] Pascal Koiran. Approximating the volume of deﬁnable sets. In FOCS, pages 134 141, 1995. [Libkin, 2018] Leonid Libkin. Certain answers meet zeroone laws. In PODS, pages 195 207, 2018. [Ma et al., 2009] Feifei Ma, Sheng Liu, and Jian Zhang. Volume computation for boolean combination of linear arithmetic constraints. In CADE, pages 453 468, 2009. [Macintyre and Wilkie, 1996] Angus Macintyre and Alex Wilkie. On the decidability of the real exponential ﬁeld. In Kreiseliana, pages 441 467. A. K. Peters, 1996. [Marcus, 1977] Daniel Marcus. Number Fields. Springer, 1977. [Miller, 2011] Chris Miller. Expansions of o-minimal structures on the real ﬁeld by trajectories of linear vector ﬁelds. Proceedings of the American Mathematical Society, 139(1):319 330, 2011. [Motwani and Raghavan, 1995] Rajeev Motwani and Prabhakar Raghavan. Randomized Algorithms. Cambridge University Press, 1995.

[Solodovnikov, 1980] Aleksandr Solodovnikov. Systems of Linear Inequalities. University of Chicago Press, 1980. [Spencer, 2010] Joel Spencer. The Strange Logic of Random Graphs. Springer, 2010. [Van den Dries, 1998] Lou Van den Dries. Tame Topology and o-minimal Structures, volume 248. Cambridge university press, 1998. [Weispfenning, 1988] Volker Weispfenning. The complexity of linear problems in ﬁelds. J. Symb. Comput., 5(1/2):3 27, 1988. [Wilkie, 1996] Alex J Wilkie. Model completeness results for expansions of the ordered ﬁeld of real numbers by restricted Pfafﬁan functions and the exponential function. Journal of the American Mathematical Society, 9(4):1051 1094, 1996.

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)