# qualitative_spatial_logics_for_buffered_geometries__d8ad16fa.pdf

Journal of Artiﬁcial Intelligence Research 56 (2016) 693-745 Submitted 03/16; published 08/16

Qualitative Spatial Logics for Buﬀered Geometries

Heshan Du H.Du@leeds.ac.uk University of Leeds, UK

Natasha Alechina Natasha.Alechina@nottingham.ac.uk University of Nottingham, UK

This paper describes a series of new qualitative spatial logics for checking consistency of same As and part Of matches between spatial objects from diﬀerent geospatial datasets, especially from crowd-sourced datasets. Since geometries in crowd-sourced data are usually not very accurate or precise, we buﬀer geometries by a margin of error or a level of tolerance σ R 0, and deﬁne spatial relations for buﬀered geometries. The spatial logics formalize the notions of buﬀered equal (intuitively corresponding to possibly same As ), buﬀered part of ( possibly part Of ), near ( possibly connected ) and far ( deﬁnitely disconnected ). A sound and complete axiomatisation of each logic is provided with respect to models based on metric spaces. For each of the logics, the satisﬁability problem is shown to be NP-complete. Finally, we brieﬂy describe how the logics are used in a system for generating and debugging matches between spatial objects, and report positive experimental evaluation results for the system.

1. Introduction

The motivation for our work on qualitative spatial logics comes from the needs of integrating disparate geospatial datasets, especially crowd-sourced geospatial datasets. Crowd-sourced data involves non-specialists in data collection, sharing and maintenance. Compared to authoritative geospatial data, which is collected by surveyors or other geodata professionals, crowd-sourced data is less accurate and less well structured, but often provides richer user-based information and reﬂects real world changes more quickly at a much lower cost (Jackson, Rahemtulla, & Morley, 2010). It is in the interests of national mapping agencies, government organisations, and all other users of geospatial data to be able to integrate and use diﬀerent geospatial data synergistically.

Geospatial data matching refers to the problem of establishing correspondences (matches) between spatial objects represented in diﬀerent geospatial datasets. It is an essential step for data comparison, data integration or enrichment, change detection and data update. Matching authoritative geospatial data and crowd-sourced geospatial data is a non-trivial task. Geometry representations of the same location or place in diﬀerent datasets are usually not exactly the same. Objects are also sometimes represented at diﬀerent levels of granularity. For example, consider geometries of objects in Nottingham city centre given by Ordnance Survey of Great Britain (OSGB) (2012) and by Open Street Map (OSM) (2012) in Figure 1. The position and shape of the Prezzo Ristorante are represented diﬀerently in OSGB data (dotted) and OSM data (solid). The Victoria Shopping Centre is represented as a whole in OSM, and as several shops in OSGB.

c 2016 AI Access Foundation. All rights reserved.

Du & Alechina

Figure 1: Prezzo Ristorante and Victoria Shopping Centre represented in OSGB (dotted) and OSM (solid)

In order to integrate the datasets, we need to determine which objects are the same (represent the same entity) and sometimes which objects in one dataset are parts of objects in the other dataset (as in the example of Victoria Shopping Centre). The statements representing these two types of relations are referred to as same As matches and part Of matches respectively. One way to produce such matches is to use locations and geometries of objects, although of course we also use any lexical labels associated with the objects, such as names of restaurants etc. In our previous work (Du, Alechina, Jackson, & Hart, 2016), we present a method which generates matches using both location and lexical information about spatial objects. As generated matches may contain errors, they are seen as retractable assumptions and require further validation and checking. One way is to use logical reasoning to check the consistency of matches with respect to statements in input datasets. By using description logic reasoning, the correctness of matches can be checked with respect to classiﬁcation information. For example, it is wrong to state that spatial objects a and b are the same, if a is a Bank and b is a Clinic, because the concepts Bank and Clinic are disjoint, containing no common elements. However, this is not suﬃcient for validating matches between spatial objects1. For example, two spatial objects which are close to each other in one dataset cannot be matched to two spatial objects which are far away apart in the other dataset, no matter whether they are of the same type or not. Therefore, spatial reasoning is required to validate matches with regard to location information, in addition to description logic reasoning. Spatial logic studies relations between geometrical structures and spatial languages describing them (Aiello, Pratt-Hartmann, & van Benthem, 2007). There are a variety of spatial relations, such as topological connectedness of regions, relations based on distances, relations for expressing orientations or directions, etc. In a spatial logic, spatial relations are represented in a formal language, such as ﬁrst order logic or its fragments, and inter-

1. There are works (Lutz & Milicic, 2007) on extending description logics with concrete domains or constraint systems, such as the region connection calculus (RCC) (Randell, Cui, & Cohn, 1992) and Allen s Interval Algebra (Allen, 1983). The description logic reasoner Pellet (Sirin, Parsia, Grau, Kalyanpur, & Katz, 2007) was extended to Pellet Spatial (Stocker & Sirin, 2009), which supports qualitative spatial reasoning in RCC. However, later we will show that it is not appropriate to use RCC in our application.

Qualitative Spatial Logics for Buffered Geometries

preted over some structures based on geometrical spaces, such as topological spaces, metric spaces and Euclidean spaces. In the ﬁeld of qualitative spatial reasoning, several spatial formalisms have been developed for representing and reasoning about topological relations, such as the Region Connection Calculus (RCC) (Randell et al., 1992), the 9-intersection model (Egenhofer & Franzosa, 1991) and their extensions (Clementini & Felice, 1997; Roy & Stell, 2001; Schockaert, Cock, Cornelis, & Kerre, 2008b, 2008a; Schockaert, Cock, & Kerre, 2009). In addition, there are formalisms for representing and reasoning about directional relations (Frank, 1991, 1996; Ligozat, 1998; Balbiani, Condotta, & del Cerro, 1999; Goyal & Egenhofer, 2001; Skiadopoulos & Koubarakis, 2004), as well as relative or absolute distances (Zimmermann, 1995; Clementini, Felice, & Hern andez, 1997; Wolter & Zakharyaschev, 2003, 2005). Recent comprehensive surveys on qualitative spatial representations and reasoning are provided by Cohn and Renz (2008) and Chen, Cohn, Liu, Wang, Ou Yang, and Yu (2015). Qualitative spatial reasoning has been shown to be applicable to geospatial data (Bennett, 1996; Bennett, Cohn, & Isli, 1997; Guesgen & Albrecht, 2000; Mallenby, 2007; Mallenby & Bennett, 2007; Li, Liu, & Wang, 2013), where location information of spatial objects comes from a single data source. The application described in this paper is diﬀerent, as location representations about the same spatial object come from diﬀerent sources and are usually not exactly the same. Rather than treating all the diﬀerences in geometric representations as logical contradictions, we would tolerate slight geometric diﬀerences and only treat qualitatively deﬁned large diﬀerences as logical contradictions used for detecting wrong matches. More speciﬁcally, after establishing matches between two sets of spatial objects, if the set of matches gives rise to a contradiction, then some match must be wrong and should be retracted. In addition, we would provide explanations to help users understand why a contradiction exists and why some matches are wrong. In the following, we assess the appropriateness of several existing spatial formalisms for these purposes. The Region Connection Calculus (RCC) (Randell et al., 1992) is a ﬁrst order formalism based on regions and the connection relation C, which is axiomatised to be reﬂexive and symmetric. Two regions x, y are connected (i.e. C(x, y) holds), if their closures share a point. Based on the connection relation, several spatial relations are deﬁned for regions. Among them, eight jointly exhaustive and pairwise disjoint (JEPD) relations are identiﬁed: DC (Disconnected), EC (Externally Connected), PO (Partially Overlap), TPP (Tangential Proper Part), NTPP (Non-Tangential Proper Part), TPPi (Inverse Tangential Proper Part), NTPPi (Inverse Non-Tangential Proper Part) and EQ (Equal). They are referred to as RCC8, which is well-known in the ﬁeld of qualitative spatial reasoning. The 9-intersection model is developed by Egenhofer and Franzosa (1991) and Egenhofer and Herring (1991) based on the point-set interpretation of geometries. By comparing the nine intersections between interiors, boundaries and exteriors of point-sets, it identiﬁes 29

mutually exclusive topological relations. The 9-intersection model provides a comprehensive formal categorization of binary topological relations between points, lines and regions. Only a small number of these 29 relations are realisable in a particular space (Egenhofer & Herring, 1991). Restricting point-sets to simple regions (regions homeomorphic to disks), the 512 relations collapse to the RCC8 relations. For the described application, we found it diﬃcult to use spatial formalisms such as the Region Connection Calculus (Randell et al., 1992) and the 9-intersection model (Egenhofer

Du & Alechina

& Franzosa, 1991), since they presuppose accurate geometries or regions with sharp boundaries and deﬁne spatial relations based on the connection relation. This is too strict for crowd-sourced geospatial data. As shown in Figure 2, a1 is same As a2, both representing a Prezzo Ristorante; b1 is same As b2, both referring to a Blue Bell Inn. Though the same As matches are correct, a topological inconsistency still exists, since a1 and b1 are disconnected (DC), a2 and b2 are externally connected (EC), and the spatial relations DC and EC are disjoint. Therefore, relations based on connection are too strict for crowd-sourced geospatial data which is possibly inaccurate and may contain errors.

Figure 2: In OSGB data, the Prezzo Ristorante (a1) and the Blue Bell Inn (b1) are disconnected, whilst in OSM data, they (a2 and b2) are externally connected.

The egg-yolk theory is independently developed by Lehmann and Cohn (1994), Cohn and Gotts (1996b, 1996a), and Roy and Stell (2001) extending the RCC theory and by Clementini and Felice (1996, 1997) extending the 9-intersection model, in order to represent and reason about regions with indeterminate boundaries. In this theory, a region with an indeterminate boundary (an indeterminate region) is represented by a pair of regions, an egg and a yolk , which are the maximum extension and the minimum extension of the indeterminate region respectively (similar to the upper approximation and lower approximation in the rough set theory, Pawlak, Polkowski, & Skowron, 2007). The yolk is not empty and it is always a proper part of the egg. The egg-yolk theory presupposes the existence of a core part of a region and a more vague part. For the described application, the same location can be represented using two disconnected polygons from an authoritative geospatial dataset and a crowd-sourced geospatial dataset respectively. In this case, we could not deﬁne a certain inner region for any of the disconnected polygons, otherwise, it is inconsistent to treat them as diﬀerent representations for the same location. We are aware that there are several approaches (Fine, 1975; Zadeh, 1975; Smith, 2008) to representing vague concepts and relations, which have been adopted to extend classical theories such as RCC and the 9-intersection model. A main approach is to assign a degree of truth or a degree of membership to concepts and relations. For example, in the fuzzy region connection calculus (fuzzy RCC) (Schockaert et al., 2008b, 2008a, 2009), the connection relation C is deﬁned as a reﬂexive and symmetric fuzzy relation. For regions a, b, C(a, b) denotes the degree to which a and b are connected. Using C as the primitive relation, every RCC relation R can be redeﬁned to calculate the degree to which R holds. The fuzzy RCC or other similar formalisms may be applied in the case shown in Figure 2. For example, with an appropriate membership function for the relation EC, EC(a1, b1) = 0.8

Qualitative Spatial Logics for Buffered Geometries

Figure 3: Buﬀering the geometry X by a distance σ; three dashed circles are buﬀered part of (BPT) the solid circle; the dashed circle and the solid circle are buﬀered equal (BEQ)

and EC(a2, b2) = 1, then no contradiction will arise. We did not adopt this approach in the matching problem, mainly because there is no good way to deﬁne the degree of membership, and it is diﬃcult to generate user-friendly explanations for why the matches are wrong if the underlying reasoning is numerical and relatively obscure. The logic MS(M) was proposed and developed by Sturm, Suzuki, Wolter, and Zakharyaschev (2000), Kutz, Sturm, Suzuki, Wolter, and Zakharyaschev (2002), Kutz, Wolter, Sturm, Suzuki, and Zakharyaschev (2003), Wolter and Zakharyaschev (2003, 2005), and Kutz (2007) for reasoning about distances. The logic MS(M) makes it possible to deﬁne concepts such as an object within the distance of 100 meters from a School . M in MS(M) is a parameter set. A typical example of M is Q 0. The satisﬁability problem for a ﬁnite set of MS(Q 0) formulas in a metric space is EXPTIME-complete (Wolter & Zakharyaschev, 2003). MS(M) was not developed for the problem of geospatial data matching. However, after we designed the logics introduced in this paper, we discovered that they form a proper fragment of MS(Q 0). To detect problematic matches, we also reason about distances between objects, but this reasoning is of a more restricted and qualitative kind. The complexity of the satisﬁability problem of our logics is NP-complete, which makes them somewhat more suitable for automatic debugging of matches than the full MS(Q 0). The syntax and semantics of MS(M) and the proofs for the proper fragment relations are provided later in this paper (see Section 3). In this paper, we present a series of new qualitative spatial logics developed for validating matches between spatial objects: a logic of NEAR and FAR for buﬀered points (LNF) (Du, Alechina, Stock, & Jackson, 2013), a logic of NEAR and FAR for buﬀered geometries (LNFS) and a logic of part and whole for buﬀered geometries (LBPT) (Du & Alechina, 2014a, 2014b). The notion of buﬀer (ISO Technical Committee 211, 2003) is used to model the uncertainty in geometry representations, tolerating slight diﬀerences up to a margin of error or a level of tolerance σ R 0. As shown in Figure 3, the buﬀer of a geometry X is a geometry which contains exactly all the points within σ distance from X. The buﬀer of X is denoted as buﬀer(X , σ). For a geometry X which is possibly represented inaccurately within the margin of error σ in one dataset, its corresponding representation in the other dataset is assumed to be somewhere within buﬀer(X , σ). The spatial logics involve four spatial relations Buﬀered Part Of (BPT), Buﬀered Equal (BEQ), NEAR and FAR. They formalize the notions of possibly part Of , possibly same As , possibly connected (given a possible displacement by σ) and deﬁnitely disconnected (even if displaced by σ) respectively. A geometry X is Buﬀered Part Of a geometry X , if X is within buﬀer(X , σ); two geometries are Buﬀered Equal, if they are Buﬀered Part Of

Du & Alechina

Figure 4: NEAR and FAR

each other (Figure 3). We assume that two geometries X and X from two diferent datasets may correspond to the same object if they are Buﬀered Equal. The parameter σ captures the margin of error in representation of geometries. Two geometries X, Y are NEAR, if the corresponding geometries X , Y in the other dataset could be connected, i.e. distance(X, Y ) [0, 2σ] (Figure 4). Clearly, if FAR(X , Y ) holds, then NEAR(X , Y ) should be false for X and Y from the same dataset. In addition, we want to exclude the possibility that NEAR(X , Y ) may hold for X , Y (corresponding to X, Y respectively) in the other dataset. Therefore we deﬁne FAR(X , Y ) as distance(X, Y ) (4σ, + ) (Figure 4). It is possible that two geometries X, Y are not NEAR and not FAR, this is, distance(X, Y ) (2σ, 4σ]. The way of deﬁning BEQ, NEAR and FAR is similar to that for deﬁning distance relations between points by Moratz and Wallgr un (2012), where each point is assigned one or more reference distances. The distance relations between two points X, Y are deﬁned by comparing the distance between X, Y to the reference distances of X and those of Y . As diﬀerent points can have diﬀerent reference distances for indicating nearness, distance relations may not be symmetric. Diﬀering from the work by Moratz and Wallgr un (2012), the relations we deﬁned are not only for points but also for general geometries, and every geometry has the same reference distances (σ, 2σ and 4σ), which leads to the symmetric deﬁnitions of BEQ, NEAR and FAR. We provide sound and complete sets of axioms to support reasoning about BEQ, BPT, NEAR and FAR relations (see Section 4). This reasoning is useful for verifying matches between spatial representations from diﬀerent sources. As explained in our previous work (Du et al., 2013), though the relations are named NEAR and FAR, we do not attempt to model human notions of nearness or proximity, which is inﬂuenced by several factors, such as absolute distance, relative distance, frame of reference, object size, travelling costs and reachability, travelling distance and attractiveness of objects (Guesgen & Albrecht, 2000). In this work, we provide a strict mathematical deﬁnition for the calculation of whether two objects are to be considered as being NEAR or FAR, based on a margin of error σ. While this makes our approach less likely to be suitable for the simulation of human notions of nearness, it provides a useful tool for verifying consistency of matches. The following arguments are formalized for checking consistency of same As and part Of matches: if spatial objects a1, b1 are same As or part Of spatial objects a2, b2 respectively, a1, b1 are NEAR, a2, b2 are FAR, then a contradiction exists. The rest of this paper is structured as follows. Section 2, Section 3 and Section 4 provide an introduction to the new spatial logics: their syntax and semantics, their relationships with the logic MS(M), their axioms and theorems. Section 5 and Section 6 present the proofs of the soundness, completeness, decidability and complexity theorems for LBPT, as the proofs for LNF and LNFS are similar and LBPT is more expressive than LNF and LNFS. Section 7 describes how LBPT is used for debugging matches between objects from

Qualitative Spatial Logics for Buffered Geometries

diﬀerent geospatial datasets. Section 8 discusses the generality and limitations of the spatial logics. Section 9 concludes the paper.

2. Syntax and Semantics

The language L(LNF) is deﬁned as

φ, ψ := BEQ(a, b) | NEAR(a, b) | FAR(a, b) | φ | φ ψ

where a, b are individual names. φ ψ def (φ ψ). The language L(LNFS) is exactly the same as L(LNF). The language L(LBPT) is almost the same as L(LNF) and L(LNFS), except that it has BPT instead of BEQ as a predicate. L(LBPT) is deﬁned as

φ, ψ := BPT(a, b) | NEAR(a, b) | FAR(a, b) | φ | φ ψ.

L(LNF), L(LNFS) and L(LBPT) are all interpreted over models based on a metric space. Every individual name involved in an LNF formula is mapped to a point, whilst each of those involved in an LNFS/LBPT formula is mapped to an arbitrary geometry or a non-empty set of points.

Deﬁnition 1 (Metric Space) A metric space is a pair ( , d), where is a non-empty set (of points) and d is a metric on , i.e. a function d : R 0, such that for any x, y, z , the following axioms are satisﬁed:

1. identity of indiscernibles: d(x, y) = 0 iﬀx = y;

2. symmetry: d(x, y) = d(y, x);

3. triangle inequality: d(x, z) d(x, y) + d(y, z).

Deﬁnition 2 (Metric Model of LNF) A metric model M of LNF is a tuple ( , d, I, σ), where ( , d) is a metric space, I is an interpretation function which maps each individual name to an element in , and σ R 0 is a margin of error. The notion of M |= φ (φ is true in the model M) is deﬁned as follows:

M |= BEQ(a, b) iﬀd(I (a), I (b)) [0, σ];

M |= NEAR(a, b) iﬀd(I (a), I (b)) [0, 2σ];

M |= FAR(a, b) iﬀd(I (a), I (b)) (4σ, );

M |= φ iﬀM |= φ;

M |= φ ψ iﬀM |= φ and M |= ψ,

where a, b are individual names, φ, ψ are formulas in L(LNF).

Deﬁnition 3 (Metric Model of LNFS/LBPT) A metric model M of LNFS/LBPT is a tuple ( , d, I, σ), where ( , d) is a metric space, I is an interpretation function which maps each individual name to a non-empty set of elements in , and σ R 0 is a margin of error. The notion of M |= φ (φ is true in the model M) is deﬁned as follows:

Du & Alechina

M |= BPT(a, b) iﬀ pa I (a) pb I (b) : d(pa, pb) [0, σ];

M |= NEAR(a, b) iﬀ pa I (a) pb I (b) : d(pa, pb) [0, 2σ];

M |= FAR(a, b) iﬀ pa I (a) pb I (b) : d(pa, pb) (4σ, );

M |= φ iﬀM |= φ;

M |= φ ψ iﬀM |= φ and M |= ψ,

where a, b are individual names, φ, ψ are formulas in L(LNFS)/L(LBPT). BEQ(a, b) is deﬁned as BPT(a, b) BPT(b, a).

The notions of validity and satisﬁability in metric models are standard. A formula is satisﬁable if it is true in some metric model. A formula φ is valid (|= φ) if it is true in all metric models (hence if its negation is not satisﬁable). The logic LNF/LNFS/LBPT is the set of all valid formulas in the language L(LNF)/L(LNFS)/L(LBPT) respectively.

3. Relationship with the logic MS(M)

The logic MS(M), as well as its variations, was developed by Sturm et al. (2000), Kutz et al. (2002, 2003), Wolter and Zakharyaschev (2003, 2005), and Kutz (2007) for reasoning about distances. MS(M) is a family of logics deﬁned relative to the parameter set M Q 0. M is subject to the following two conditions: if a, b M and a + b r, then a + b M, where r = sup M if M is bounded, otherwise r = ; if a, b M and a b > 0, then a b M. The alphabet of MS(M) consists of

an inﬁnite list of region variables X1, X2,...;

an inﬁnite list of location constants c1, c2,...;

a set constant {ci} for every location constant ci;

binary distance (δ), equality ( .=) and membership ( ) predicates;

the boolean operators , (and their derivatives , and );

two distance quantiﬁers <a, a and their duals <a, a, for every a M;

two universal quantiﬁers and .

MS(M) terms are deﬁned as:

s, t := Xi | {ci} | | | s | s t | <as | as | s.

In addition to standard description logic concept constructions, MS(M) can deﬁne a concept of objects which are at a distance less than a from instances of some other concept s: <as, and similarly for a distance at most a. <as and as are deﬁned as <a( s) and a( s) respectively.

Qualitative Spatial Logics for Buffered Geometries

MS(M) formulas are deﬁned as

φ, ψ := c s | s .= t | δ(c1, c2) < a | δ(c1, c2) a | φ | φ ψ.

Further, s t is an abbreviation for (s t) .= s and s .= t is an abbreviation for (s .= t). δ(c1, c2) > a and δ(c1, c2) a are deﬁned as (δ(c1, c2) a) and (δ(c1, c2) < a) respectively. An MS(M)-model B is a structure of the form:

B = W, d, XB 1 , XB 2 , ..., c B 1 , c B 2 , ...

where W, d is a metric space (Deﬁnition 1), each XB i is a subset of W, and each c B i is an element of W. The value of any other MS(M)-term in B is computed inductively as follows:

B = W, B = ;

{ci}B = {c B i };

( s)B = W s B;

(s1 s2)B = s B 1 s B 2 ;

( <as)B = {x W | y s B : d(x, y) < a};

( as)B = {x W | y s B : d(x, y) a};

( s)B = {x W | y s B}.

<a, a and are dual to <a, a and respectively. For instance,

( <as)B = {x W | y W : (d(x, y) < a y s B)}.

The truth condition of B |= φ, where φ is an MS(M)-formula, is deﬁned as follows:

B |= c s iﬀc B s B;

B |= s1 .= s2 iﬀs B 1 = s B 2 ;

B |= δ(k, l) < a iﬀd(k B, l B) < a;

B |= δ(k, l) a iﬀd(k B, l B) a;

B |= φ iﬀB |= φ;

B |= φ ψ iﬀB |= φ and B |= ψ.

A set of MS(M) formulas Σ is satisﬁable, if there exists an MS(M)-model B such that B |= φ for every φ Σ. This is denoted as B |= Σ. It is proved below that LNF/LNFS/LBPT are proper fragments of the logic MS(Q 0). Strictly speaking, this only holds when σ Q 0, but later we will show that a ﬁnite set of LNF/LNFS/LBPT formulas is satisﬁable where σ R 0, if it is satisﬁable when σ = 1. In other words, σ acts as a scaling factor (see the proof of Lemma 43).

Du & Alechina

Lemma 1 For individual names a, b, the MS(M) formula {a} {b} is not expressible in LNF.

Proof. Let M1, M2 be metric models2. M1 = ( 1, d, I1, σ), M2 = ( 2, d, I2, σ). In M1, 1 = {o1, o2}, d(o1, o2) = σ. I1(a) = o1, I1(b) = o2. For any individual name x diﬀering from a, b, I1(x) = o1. In M2, 2 = {o}. I2(a) = o, I2(b) = o. For any individual name x diﬀering from a, b, I2(x) = o. For any individual name y, Ii({y}) = {Ii(y)}, i {1, 2}. By the deﬁnitions of M1, M2, for any individual names x, y, d(I1(x), I1(y)) [0, σ], d(I2(x), I2(y)) = 0. If φ is an atomic LNF formula about x, y, then by Deﬁnition 2, M1 |= φ iﬀM2 |= φ. By an easy induction on logical connectives, for any LNF formula φ, M1 |= φ iﬀM2 |= φ. Since I1({a}) = {o1}, I1({b}) = {o2} and I2({a}) = I2({b}) = {o}, by the truth deﬁnition of MS(M) formulas, M1 |= ({a} {b}), M2 |= ({a} {b}). Hence, {a} {b} is not equivalent to any LNF formula.

Lemma 2 The logic LNF is a proper fragment of the logic MS(Q 0).

Proof. Every atomic LNF formula is expressible in MS(Q 0):

BEQ(a, b) (δ(a, b) 0) (δ(a, b) σ);

NEAR(a, b) (δ(a, b) 0) (δ(a, b) 2σ);

FAR(a, b) (δ(a, b) > 4σ).

This means that all LNF formulas can be expressed in a fragment of MS(Q 0) (the image of LNF under the translation above) which only contains location constants, binary distance predicate and boolean connectives , . By Lemma 1, LNF is a proper fragment of MS(M).

Lemma 3 For individual names a, b, the MS(M) formula a b is not expressible in LNFS/LBPT.

Proof. Let M1, M2 be metric models3. M1 = ( 1, d, I1, σ), M2 = ( 2, d, I2, σ). In M1, 1 = {o1, o2}, d(o1, o2) = σ. I1(a) = {o1}, I1(b) = {o2}. For any individual name x diﬀering from a, b, I1(x) = {o1}. In M2, 2 = {o}. I2(a) = {o}, I2(b) = {o}. For any individual name x diﬀering from a, b, I2(x) = {o}. If φ is an atomic LNFS/LBPT formula about x, y, then by Deﬁnition 3, M1 |= φ iﬀ M2 |= φ. By an easy induction on logical connectives, for any LNFS/LBPT formula φ, M1 |= φ iﬀM2 |= φ.

2. Note that we can construct models in a one-dimensional or two-dimensional Euclidean space in a similar way and prove the lemma. 3. Note that we can construct models in a one-dimensional or two-dimensional Euclidean space in a similar way and prove the lemma.

Qualitative Spatial Logics for Buffered Geometries

By the truth deﬁnition of MS(M) formulas, M1 |= (a b) and M2 |= (a b). Hence, a b is not equivalent to any LNFS/LBPT formula.

Lemma 4 The logic LNFS/LBPT is a proper fragment of MS(Q 0).

Proof. Every atomic LNFS/LBPT formula is expressible in MS(Q 0):

(For LNFS) BEQ(a, b) iﬀ(a ( σb)) (b ( σa));

(For LBPT) BPT(a, b) iﬀ(a ( σb));

NEAR(a, b) iﬀ(a ( 2σb) .= );

FAR(a, b) iﬀ(a ( 4σb) .= ).

Note that the formulas on the right belong to a fragment of MS(Q 0) which is the image of LNFS/LBPT under the translation above. The correctness of translation of BEQ(a, b) and BPT(a, b) into MS(Q 0) follows directly from the truth deﬁnition of BEQ and BPT (Deﬁnition 3). To show that the translation of NEAR and FAR are correct, consider that the truth deﬁnition of NEAR(a, b) is equivalent to 0 dmin(a, b) 2σ and FAR(a, b) to dmin(a, b) > 4σ, where dmin(a, b) = inf{d(pa, pb) | pa I(a), pb I(b)}. It was shown by Wolter and Zakharyaschev (2005) that dmin(a, b) m iﬀa ( mb) .= . This makes the translation of the formulas have the same truth conditions as deﬁned in Deﬁnition 3. By Lemma 3, LNFS/LBPT is a proper fragment of MS(Q 0).

Wolter and Zakharyaschev (2003) proved that the satisﬁability problem for a ﬁnite set of MS(Q 0) formulas in a metric space is EXPTIME-complete, which provides an upper bound on the complexity of the satisﬁability problems of LNF, LNFS and LBPT in a metric space. Kutz et al. (2002) and Kutz (2007) gave axioms or inference rules connecting MS(M) terms (e.g. 0s s) for MS(M) and its variants. However, the axiomatisation we are going to present is for LNF, LNFS and LBPT formulas (corresponding to MS(M) formulas rather than MS(M) terms).

4. Axioms and Theorems

This section presents a sound and complete axiomatisation for the logic LNF/LNFS/LBPT respectively. The axiomatic systems have been used as a basis for a rule-based reasoner described later in Section 7 4. The following calculus (which we will also refer to as LNF) is sound and complete for LNF:

Axiom 0 All tautologies of classical propositional logic

4. It is important to have a complete axiomatisation. Otherwise, the reasoner can not detect all the LNF/LNFS/LBPT inconsistencies caused by problematic matches.

Du & Alechina

Axiom 1 BEQ(a, a);

Axiom 2 BEQ(a, b) BEQ(b, a);

Axiom 3 NEAR(a, b) NEAR(b, a);

Axiom 4 FAR(a, b) FAR(b, a);

Axiom 5 BEQ(a, b) BEQ(b, c) NEAR(c, a);

Axiom 6 BEQ(a, b) NEAR(b, c) BEQ(c, d) FAR(d, a);

Axiom 7 NEAR(a, b) NEAR(b, c) FAR(c, a);

MP Modus ponens: φ, φ ψ ψ.

The following calculus (which we will also refer to as LNFS) is sound and complete for LNFS:

Axiom 0 All tautologies of classical propositional logic

Axiom 1 BEQ(a, a);

Axiom 2 BEQ(a, b) BEQ(b, a);

Axiom 3 NEAR(a, b) NEAR(b, a);

Axiom 4 FAR(a, b) FAR(b, a);

Axiom 5 BEQ(a, b) BEQ(b, c) NEAR(c, a);

Axiom 6 BEQ(a, b) NEAR(b, c) BEQ(c, d) FAR(d, a);

Axiom 8 NEAR(a, b) BEQ(b, c) BEQ(c, d) FAR(d, a);

MP Modus ponens: φ, φ ψ ψ.

Axiom 7 of the calculus LNF only holds for points, but not for arbitrary geometries, because two points within the same line or polygon can be far from each other. Axiom 7 is replaced by Axiom 8 in LNFS. All other axioms in LNFS are the same as those in LNF. The following calculus (which we will also refer to as LBPT) is sound and complete for LBPT:

Axiom 0 All tautologies of classical propositional logic

Axiom 3 NEAR(a, b) NEAR(b, a);

Axiom 4 FAR(a, b) FAR(b, a);

Axiom 9 BPT(a, a);

Axiom 10 BPT(a, b) BPT(b, c) NEAR(c, a);

Qualitative Spatial Logics for Buffered Geometries

Axiom 11 BPT(b, a) BPT(b, c) NEAR(c, a);

Axiom 12 BPT(b, a) NEAR(b, c) BPT(c, d) FAR(d, a);

Axiom 13 NEAR(a, b) BPT(b, c) BPT(c, d) FAR(d, a);

MP Modus ponens: φ, φ ψ ψ.

The calculus LBPT is similar to the calculus LNFS. Changing predicates from BEQ to BPT, LNFS Axioms 1, 6, 8 are replaced by Axioms 9, 12, 13 respectively in LBPT. Since BPT is not symmetric, LNFS Axiom 2 does not have a corresponding axiom in LBPT, and LNFS Axiom 5 is replaced by two LBPT axioms, Axiom 10 and Axiom 11. The notion of derivability Γ φ in LNF/LNFS/LBPT calculus is standard. A formula φ is derivable if φ. A set Γ is LNF/LNFS/LBPT-inconsistent if for some formula φ it derives both φ and φ. We proved the following theorems for LNF, LNFS and LBPT.

Theorem 1 (Soundness and Completeness) The LNF/LNFS/LBPT calculus is sound and complete for metric models, namely that

(every derivable formula is valid and every valid formula is derivable).

Theorem 2 (Decidability and Complexity) The satisﬁability problem for a ﬁnite set of LNF/LNFS/LBPT formulas in a metric space is NP-complete.

In the following sections, we give proofs of the results above for the case of LBPT. The proofs for LNF and LNFS are similar. For LBPT, we have the following derivable formulas, which we will refer to as facts in the completeness proof:

Fact 14 BPT(a, b) NEAR(a, b);

Fact 15 NEAR(a, b) FAR(a, b);

Fact 16 NEAR(a, b) BPT(b, c) FAR(c, a);

Fact 17 BPT(a, b) FAR(a, b);

Fact 18 BPT(a, b) BPT(b, c) FAR(c, a);

Fact 19 BPT(b, a) BPT(b, c) FAR(c, a);

Fact 20 BPT(a, b) BPT(b, c) BPT(c, d) FAR(d, a);

Fact 21 BPT(b, a) BPT(b, c) BPT(c, d) FAR(d, a);

Fact 22 BPT(a, b) BPT(b, c) BPT(c, d) BPT(d, e) FAR(e, a);

Fact 23 BPT(b, a) BPT(b, c) BPT(c, d) BPT(d, e) FAR(e, a);

Fact 24 BPT(b, a) BPT(c, b) BPT(c, d) BPT(d, e) FAR(e, a).

As shown by Facts 17-24, a chain of at most four BPTs implies the negation of FAR, because FAR is deﬁned as being > 4σ distance away in Deﬁnition 3.

Du & Alechina

5. Soundness and Completeness of LBPT

This section shows that the LBPT calculus is sound and complete for metric models. Though several deﬁnitions and lemmas have been presented in our previous work (Du et al., 2013; Du & Alechina, 2014b), the proofs presented here are more complete, structured, accurate (small errors are corrected) and simpliﬁed. The proof of soundness (every LBPT derivable formula is valid: φ |= φ) is by an easy induction on the length of the derivation of φ. Axioms 3, 4, 9-13 are valid (by the truth deﬁnition of BPT, NEAR and FAR) and modus ponens preserves validity. In the rest of this section, we prove completeness (every LBPT valid formula is derivable): |= φ φ

We will actually prove that if a ﬁnite set of LBPT formulas Σ is consistent, then there is a metric model satisfying it. Any ﬁnite set of formulas Σ can be rewritten as a formula ψ which is the conjunction of all formulas in Σ. Σ is consistent, iﬀψ is consistent ( ψ). If there is a metric model M satisfying Σ, then M satisﬁes ψ, thus |= ψ. Therefore, if we show that if Σ is consistent, then there exists a metric model satisfying it, then we show that if ψ, then |= ψ. This shows that φ |= φ and by contraposition we get completeness. The completeness theorem is proved by constructing a metric model for a maximal consistent set (Deﬁnition 4) of any ﬁnite consistent set of LBPT formulas (Lemma 5).

Deﬁnition 4 (MCS) A set of formulas Γ in the language L(LBPT) is maximal consistent, if Γ is consistent, and any set of LBPT formulas over the same set of individual names properly containing Γ is inconsistent. If Γ is a maximal consistent set of formulas, then we call it an MCS.

Proposition 1 (Properties of MCSs) If Γ is an MCS, then,

Γ is closed under modus ponens: if φ, φ ψ Γ, then ψ Γ;

if φ is derivable, then φ Γ;

for all formulas φ: φ Γ or φ Γ;

for all formulas φ, ψ: φ ψ Γ iﬀφ Γ and ψ Γ;

for all formulas φ, ψ: φ ψ Γ iﬀφ Γ or ψ Γ.

Lemma 5 (Lindenbaum s Lemma) If Σ is a consistent set of formulas in the language L(LBPT), then there is an MCS Σ+ over the same set of individual names such that Σ Σ+.

Let φ0, φ1, φ2, ... be an enumeration of LBPT formulas over the same set of individual names as that in Σ. Σ+ can be deﬁned as follows:

Σn+1 = Σn {φn}, if it is consistent, otherwise, Σn+1 = Σn { φn};

Qualitative Spatial Logics for Buffered Geometries

Σ+ = S n 0 Σn.

For a ﬁnite consistent set of formulas Σ, we construct a metric model satisfying a maximal consistent set Σ+, which contains Σ and is over the same set of individual names as Σ, as follows. Firstly, we equivalently transform Σ+ to B(Σ+), which is a set of basic quantiﬁed formulas. Then we construct a set of distance constraints D(Σ+) from B(Σ+). A key concept here is path-consistency for a set of distance constraints.

Deﬁnition 5 (Non-negative Interval) An interval h is non-negative, if h [0, + ).

Deﬁnition 6 (Distance Constraint, Distance Range) A distance constraint is a statement of the form d(p, q) g, where p, q are constants representing points, d(p, q) stands for the distance between p, q, and g is a non-negative interval, which stands for the distance range for p, q.

Deﬁnition 7 (Composition) If d1, d2 are non-negative real numbers, then the composition of {d1} and {d2} is deﬁned as: {d1} {d2} = [|d1 d2|, d1 + d2] 5. If g1, g2 are nonnegative intervals, then their composition is an interval which is the union of all {d1} {d2}, where d1 g1, d2 g2, this is,

g1 g2 = S d1 g1,d2 g2{d1} {d2}.

It is assumed that a set of distance constraints D contains at most one distance range for each pair of constants p, q involved in D, and D is closed under symmetry, i.e. if d(p, q) g is in D, then d(q, p) g is in D.

Deﬁnition 8 (Path-Consistency) For a set of distance constraints D, for every pair of diﬀerent constants p, q involved in D, their distance range is strengthened by successively applying the following operation until a ﬁxed point is reached:

s : g(p, q) g(p, q) (g(p, s) g(s, q))

where s is a constant in D, s = p, s = q, and g(p, q) denotes the distance range for p, q. This process is called enforcing path-consistency on D. If at a ﬁxed point, for every pair of constants p, q, g(p, q) = , then D is called path-consistent.

In this paper, we say an interval is referred to in the process of enforcing path-consistency on D, if it occurs in D or is involved in the enforcement of the operation g(p, q) g(p, q) (g(p, s) g(s, q)). In other words, it is used as g(p, q), g(p, s) or g(s, q). A distance constraint appears in the process of enforcing path-consistency on D, if its distance range (an interval) is referred to in the process of enforcing path-consistency on D. The way of enforcing path-consistency on a set of distance constraints deﬁned above is almost the same as that of enforcing path-consistency on a binary constraint satisfaction problem (CSP) (Renz & Nebel, 2007; van Beek, 1992), except that the operation s : g(p, q) g(p, q) (g(p, s) g(s, q)) ( is the composition operator for non-negative intervals, Deﬁnition 7) is applied instead of k : Rij Rij (Rik Rkj) ( is the composition operator

5. Based on d(x, z) d(x, y) + d(y, z) (Property 3 of Deﬁnition 1).

Du & Alechina

for relations). The time complexity of the path-consistency algorithm for CSP is O(n3) (van Beek, 1992; Mackworth & Freuder, 1985), where n is the number of variables involved in the input set of binary constraints. The path-consistency algorithm for CSP can be adapted easily for enforcing path-consistency on a set of distance constraints. The time complexity of the resulting path-consistency algorithm is also O(n3), where n is the number of constants involved in the input set of distance constraints. Later in this paper, we will show that the process of enforcing path-consistency on D(Σ+) terminates, and a ﬁxed point can be reached in O(n3) (see Lemma 33). After constructing a set of distance constraints D(Σ+) from Σ+, we prove the Metric Model Lemma, Metric Space Lemma and Path-Consistency Lemma which are stated below. The notion of path-consistency acts as a bridge between the lemmas.

Lemma 6 (Metric Model Lemma) Let Σ be a ﬁnite consistent set of formulas, and Σ+

be an MCS which contains Σ and is over the same set of individual names as Σ. If a metric space satisﬁes D(Σ+), then it can be extended to a metric model satisfying Σ+.

Lemma 7 (Metric Space Lemma) Let Σ be a ﬁnite consistent set of formulas, and Σ+

be an MCS which contains Σ and is over the same set of individual names as Σ. If D(Σ+) is path-consistent, then there is a metric space ( , d) such that all distance constraints in D(Σ+) are satisﬁed.

Lemma 8 (Path-Consistency Lemma) Let Σ be a ﬁnite consistent set of formulas, and Σ+ be an MCS which contains Σ and is over the same set of individual names as Σ. Then, D(Σ+) is path-consistent.

Using these three lemmas, we prove the completeness of LBPT: if a ﬁnite set of formulas Σ is LBPT-consistent, then there exists a metric model satisfying it.

Proof. If Σ is LBPT-consistent, by the Lindenbaum s Lemma (Lemma 5), there is an MCS Σ+ over the same set of individual names such that Σ Σ+. By the Path-Consistency Lemma (Lemma 8) and the Metric Space Lemma (Lemma 7), there is a metric space ( , d) such that all distance constraints in D(Σ+) are satisﬁed. By the Metric Model Lemma (Lemma 6), the metric space can be extended to a metric model M satisfying Σ+. Since Σ Σ+, M satisﬁes Σ.

The detailed proofs of the Metric Model Lemma, Metric Space Lemma and Path Consistency Lemma are provided in Section 5.1, Section 5.2 and Section 5.3 respectively. Note that, in this paper, Σ+ denotes an MCS which contains a given ﬁnite consistent set of formulas Σ and is over the same set of individual names as Σ.

5.1 Metric Model Lemma

This section shows how to construct a set of distance constraints D(Σ+) from Σ+, and presents the proof of the Metric Model Lemma. By the deﬁnition and properties of MCSs (Deﬁnition 4 and Proposition 1), the following lemma holds.

Qualitative Spatial Logics for Buffered Geometries

Lemma 9 If Σ+ is an MCS, then for any pair of individual names a, b occurring in Σ, exactly one of the following cases holds in Σ+:

1. case(a, b) = BPT(a, b) BPT(b, a);

2. case(a, b) = BPT(a, b) BPT(b, a);

3. case(a, b) = BPT(a, b) BPT(b, a);

4. case(a, b) = BPT(a, b) BPT(b, a) NEAR(a, b);

5. case(a, b) = NEAR(a, b) FAR(a, b);

6. case(a, b) = FAR(a, b),

where case(a, b) denotes the formula which holds between a, b in each case.

Lemma 9 is proved using LBPT axioms and facts (such as Axiom 3, Facts 14, 15) in the same way as proving the lemma for LNF (see Du et al., 2013). The full proof of Lemma 9 is provided in Appendix A. The construction of a set of distance constraints D(Σ+) from Σ+ has two main steps:

Step 1 For every pair of individual names a, b occurring in Σ, we translate case(a, b) into a set of ﬁrst order formulas which is equi-satisﬁable to case(a, b). The union of all such sets of ﬁrst order formulas is B(Σ+) (hence, B(Σ+) and Σ are equi-satisﬁable.). This step is described by Deﬁnition 9 and Deﬁnition 10.

Step 2 We construct a set of distance constraints D(Σ+) from B(Σ+). This step is described by Deﬁnitions 11-13.

For LBPT formulas, there are ﬁrst order formulas corresponding to their truth deﬁnition in Deﬁnition 3. We use formulas of the form d(p, q) g as abbreviations of their equivalent ﬁrst order formulas. For example, d(p, q) [0, σ] abbreviates d(p, q) 0 d(p, q) σ. Observe that6

BPT(a, b) and pa a pb b : d(pa, pb) [0, σ] are equi-satisﬁable ;

NEAR(a, b) and pa a pb b : d(pa, pb) [0, 2σ] are equi-satisﬁable;

FAR(a, b) and pa a pb b : d(pa, pb) (4σ, ) are equi-satisﬁable.

Deﬁnition 9 (Basic Quantiﬁed Formula) We refer to the ﬁrst order formulas of the following forms as basic quantiﬁed formulas:

pa a pb b : d(pa, pb) g;

pa a pb b : d(pa, pb) g;

6. Note that by pa a pb b : d(pa, pb) [0, σ], we are actually quantifying over a metric space. In such sense, it is more precise to say, for example, BPT(a, b) is satisﬁable in a metric model, iﬀ pa a pb b : d(pa, pb) [0, σ] is satisﬁable over a metric space.

Du & Alechina

pa a pb b : d(pa, pb) g;

pa a pb b : d(pa, pb) g,

where g is a non-negative interval. The abbreviations of these four forms are deﬁned as (a, b, g), (a, b, g), χ(a, b, g) and ξ(a, b, g) respectively. In other words, for example, (a, b, g) ( pa a pb b : d(pa, pb) g).

Now we translate the formula in each case listed in Lemma 9 into basic quantiﬁed formulas, which will be used to count the number of points needed for interpreting individual names occurring in Σ later.

Deﬁnition 10 (B(Σ+)) For an MCS Σ+ over the same set of individual names as Σ, its corresponding set of basic quantiﬁed formulas B(Σ+) is constructed as follows. For every pair of individual names a, b, we translate case(a, b) into basic quantiﬁed formulas:

translate(BPT(a, b) BPT(b, a)) = {χ(a, b, [0, σ]), χ(b, a, [0, σ])};

translate(BPT(a, b) BPT(b, a)) = {χ(a, b, [0, σ]), ξ(b, a, (σ, ))};

translate( BPT(a, b) BPT(b, a)) = {ξ(a, b, (σ, )), χ(b, a, [0, σ])};

translate( BPT(a, b) BPT(b, a) NEAR(a, b)) = {ξ(a, b, (σ, )), ξ(b, a, (σ, )), (a, b, [0, 2σ]), (b, a, [0, 2σ])};

translate( NEAR(a, b) FAR(a, b)) = { (a, b, (2σ, )), (b, a, (2σ, )), (a, b, [0, 4σ]), (b, a, [0, 4σ])};

translate(FAR(a, b)) = { (a, b, (4σ, )), (b, a, (4σ, ))},

where σ R 0 is a ﬁxed margin of error. Let names(Σ) be the set of individual names occurring in Σ. Then,

B(Σ+) = S a names(Σ),b names(Σ) translate(case(a, b)).

In the following, for a set of basic quantiﬁed formulas B(Σ+), we construct a set of distance constraints D(Σ+), and then show that if there is a metric space satisfying D(Σ+), then it can be extended to a model of Σ+. In other words, we are constructing a metric over a set of points used to interpret individual names. The number of points needed for interpreting each individual name depends on the numbers of diﬀerent forms of formulas in B(Σ+). For any individual name a, let us predict how many particular constants in points(a) (points assigned to an individual name a) can be speciﬁed by the ﬁnite set of formulas about a in B(Σ+). points(a) contains at least one constant. If a formula in B(Σ+) says there exists a constant in points(a) , then this constant is a particular constant within points(a). For any pair of diﬀerent individual names a, b, if both (a, b, g) and (b, a, g) are in B(Σ+), we only count one of them; if χ(a, b, g) is in B(Σ+), we map all the constants in points(a) to the same constant in points(b). By Lemma 9 and Deﬁnition 10, in B(Σ+), for any pair of diﬀerent individual names a, b and R { , ξ, χ} we never have R(a, b, g1) and R(a, b, g2), where g1 = g2, at the same time. The cardinality of points(a) is speciﬁed as follows.

Qualitative Spatial Logics for Buffered Geometries

Deﬁnition 11 (num(a, B(Σ+)) and points(a)) Let names(Σ) be the set of individual names occurring in Σ 7. For any individual name a names(Σ),

num( a, B(Σ +)) = |{b names(Σ) | g : (a, b, g) B(Σ +)}| num(ξa, B(Σ +)) = |{b names(Σ) | g : ξ(a, b, g) B(Σ +)}| num(χa, B(Σ +)) = |{b names(Σ) | g : χ(b, a, g) B(Σ +)}|

Then num(a, B(Σ +)) = num( a, B(Σ +)) + num(ξa, B(Σ +)) + num(χa, B(Σ +)). points(a) is a set of constants {p1 a, . . . , pn a}, where n = num(a, B(Σ+)).

Deﬁnition 12 (Witness for a formula) A witness for a formula (a, b, g) is a pair of constants pa points(a), pb points(b) such that d(pa, pb) g. A witness for a formula ξ(a, b, g) or χ(b, a, g) is a constant pa points(a), such that for any constant pb points(b), d(pa, pb) g. A constant is clean for a formula, if it is not a witness for any other formula.

Deﬁnition 13 (D(Σ+)) Let B(Σ+) be the corresponding set of basic quantiﬁed formulas of an MCS Σ+. For every individual name a in Σ, we assign a ﬁxed set of new constants points(a) to it. We construct a set of distance constraints D(Σ+) as follows, by iterating through the basic quantiﬁed formulas in B(Σ+) and eliminating quantiﬁers on new constants. Initially, D(Σ+) = {}. For every individual name a in Σ, for every constant pa points(a), we add d(pa, pa) {0} to D(Σ+). For every pair of diﬀerent individual names a, b, if

(a, b, g) B(Σ +), then we take clean constants pa points(a), pb points(b), and add d(pa, pb) = d(pb, pa) g to D(Σ+), so pa, pb become a witness for (a, b, g);

ξ(a, b, g) B(Σ +), then we take a clean constant pa points(a), for every pb points(b), we add d(pa, pb) = d(pb, pa) g to D(Σ+), so pa becomes a witness for ξ(a, b, g);

ξ(b, a, g) B(Σ +), then we take a clean constant pb points(b), for every pa points(a), we add d(pa, pb) = d(pb, pa) g to D(Σ+), so pb becomes a witness for ξ(b, a, g);

χ(a, b, g) B(Σ +), then we take a clean constant pb points(b), for every pa points(a), we add d(pa, pb) = d(pb, pa) g to D(Σ+), so pb becomes a witness for χ(a, b, g);

χ(b, a, g) B(Σ +), then we take a clean constant pa points(a), for every pb points(b), we add d(pa, pb) = d(pb, pa) g to D(Σ+), so pa becomes a witness for χ(b, a, g);

(a, b, g) B(Σ +), then for every pair of constants pa points(a), pb points(b), we add d(pa, pb) = d(pb, pa) g to D(Σ+).

For every pair of diﬀerent constants p, q in D(Σ+), we add d(p, q) = d(q, p) [0, ) to D(Σ+), then repeatedly replace d(p, q) = d(q, p) g1 and d(p, q) = d(q, p) g2 with d(p, q) = d(q, p) (g1 g2), until there is only one distance range for each pair of p, q in D(Σ+).

7. By the deﬁnition of Σ+, Σ+ contains the same set of individual names as Σ.

Du & Alechina

In Deﬁnition 13, for every pair of diﬀerent individual names a, b, we check whether χ(a, b, g) B(Σ +) holds and check whether χ(b, a, g) B(Σ +) holds, because it is possible only one of them holds. For the same reason, we check ξ(a, b, g) B(Σ +) and ξ(b, a, g) B(Σ +) separately. By Deﬁnition 10, (a, b, g) B(Σ +) iﬀ (b, a, g) B(Σ +). Hence we only need to check any one of them. We only check whether (a, b, g) B(Σ +) holds, as (a, b, g) B(Σ +) iﬀ (b, a, g) B(Σ +).

Lemma 10 When constructing D(Σ+), for any individual name a, the number of clean constants needed from points(a) is no larger than num(a, B(Σ+)).

Proof. By Deﬁnition 10, for any individual name a, χ(a, a, [0, σ]) is in B(Σ+). By Deﬁnition 11, num(a, B(Σ+)) 1. If a is not involved in any formula of the form (a, b, g), ξ(a, b, g) or χ(b, a, g), for any other individual name b, then by Deﬁnition 11, num(a, B(Σ+)) = 1. By Deﬁnition 13, we need no clean constants from points(a). Otherwise, by Lemma 9 and Deﬁnition 10, in B(Σ+), for any pair of diﬀerent individual names a, b and R { , ξ, χ}, we never have R(a, b, g1) and R(a, b, g2), where g1 = g2, at the same time. By Deﬁnition 13, for each (a, b, g) B(Σ+), we take one clean constant from points(a), so num( a, B(Σ+)) clean constants are needed in total for all formulas of this form. Similarly, num(ξa, B(Σ+)) and (num(χa, B(Σ+)) 1) clean constants are needed for formulas of forms ξ(a, b, g) and χ(b, a, g) respectively, where a, b are diﬀerent individual names. We do not need any other clean constant from points(a) for formulas in other forms. By Deﬁnition 11, num(a, B(Σ+)) is enough.

D(Σ+) and B(Σ+) are not equi-satisﬁable because of the way we assign witnesses for χ formulas. More speciﬁcally, for any pair of diﬀerent individual names a, b, if χ(a, b, g) is in B(Σ+), we map all the constants in points(a) to the same constant in points(b). In other words, if B (Σ+) is the set of formulas resulting from replacing every χ(a, b, g) in B(Σ+) with ξ(b, a, g), then D(Σ+) and B (Σ+) are equi-satisﬁable. Since every individual name is interpreted as a non-empty set of constants, if a model satisﬁes ξ(b, a, g), then it satisﬁes χ(a, b, g), but not vice versa. Hence, constructing D(Σ+) for B (Σ+) rather than B(Σ+) imposes stronger restrictions (i.e. χ(a, b, g) in B(Σ+) is replaced with ξ(b, a, g) in B (Σ+)) on the metric space compared to that required by Σ+. However, later we will show that if Σ+ is consistent, then D(Σ+) can be satisﬁed in a metric space by proving the Metric Space Lemma and Path-Consistency Lemma in the following sections. Before proving the Metric Model Lemma, let us look at some important properties of D(Σ+), as shown by Lemmas 11-13. The proof of Lemma 11 is provided in Appendix A. Lemma 12 follows from the proof of Lemma 11.

Lemma 11 For any distance range g occurring in D(Σ+),

g {{0}, [0, σ], (σ, ), [0, 2σ], (2σ, ), (2σ, 4σ], (4σ, ), [0, )}.

Lemma 12 If p points(a), q points(b), and a = b, then d(p, q) {0} is not in D(Σ+).

Lemma 13 The number of constants in D(Σ+) is ﬁnite.

Qualitative Spatial Logics for Buffered Geometries

Proof. It is assumed that Σ is a ﬁnite consistent set of formulas over n (a ﬁnite number) individual names. By Lemma 9 and Deﬁnition 10, B(Σ+) contains at most f = (n + 2n(n 1)) formulas over n individual names. By Deﬁnition 11, for any individual name a, num(a, B(Σ+)) f. By Deﬁnition 13, the number of constants in D(Σ+) is at most nf.

The Metric Model Lemma is proved as follows.

Lemma 14 If a metric model satisﬁes B(Σ+), then it satisﬁes Σ+.

Proof. The lemma follows from two observations. First, by Lemma 9, Σ+ is entailed by the set C(Σ+) = {case(a, b) : a names(Σ+), b names(Σ+)}. Second, by Deﬁnition 10, B(Σ+) is a translation of truth conditions of C(Σ+) into ﬁrst order logic. If a metric model satisﬁes B(Σ+), then it satisﬁes C(Σ+), and hence it satisﬁes Σ+.

Lemma 6 (Metric Model Lemma) Let Σ be a ﬁnite consistent set of formulas, and Σ+ be an MCS which contains Σ and is over the same set of individual names as Σ. If a metric space satisﬁes D(Σ+), then it can be extended to a metric model satisfying Σ+.

Proof. Suppose a metric space satisﬁes D(Σ+). We extend it to a metric model M by interpreting every individual name a occurring in Σ+ as points(a), a s corresponding set of constants of size num(a, B(Σ+)) (Deﬁnition 11 and Deﬁnition 13). By Deﬁnition 13, any formula of the form χ(a, a, [0, σ]) is satisﬁed by M. For any pair of diﬀerent individual names, every , ξ or χ formula has a witness, and all formulas are also satisﬁed by M. Therefore, M is a metric model of B(Σ+). By Lemma 14, M is a metric model of Σ+.

5.2 Metric Space Lemma

As the process of enforcing path-consistency (Deﬁnition 8) involves the application of the composition operator (Deﬁnition 7), we present several lemmas in Section 5.2.1 to demonstrate the main calculation rules of and the properties of intervals obtained from composition. In Section 5.2.2, we characterize distance constraints in D(Σ+) and those appearing in the process of enforcing path-consistency on D(Σ+). Using the deﬁnitions and lemmas introduced in Section 5.2.1 and Section 5.2.2, the Metric Space Lemma is proved in Section 5.2.3.

5.2.1 The Composition Operator

In this section, we present several lemmas to show the main calculation rules of the composition operator and the properties of intervals obtained from composition. These lemmas are important for understanding several proofs in later sections. Lemmas 15-16 follow from Deﬁnition 7.

Lemma 15 Let g1, g2 be non-negative intervals. If d3 g1 g2, then there exist d1 g1, d2 g2 such that d3 [|d1 d2|, d1 + d2].

Du & Alechina

Lemma 16 (Calculation of Composition) If (m, n), (s, t), (m, ), (s, ), {l}, {r} are non-negative non-empty intervals, H1, H2, H are non-negative intervals, then the following calculation rules hold:

1. {l} {r} = [l r, l + r], if l r;

2. {l} (s, t) = (s l, t + l), if s l;

3. {l} (s, t) = [0, t + l), if l (s, t);

4. {l} (s, t) = (l t, t + l), if t l;

5. {l} (s, + ) = (s l, + ), if s l;

6. {l} (s, + ) = [0, + ), if s < l;

7. (m, n) (s, t) = (s n, t + n), if s n;

8. (m, n) (s, t) = [0, t + n), if (m, n) (s, t) = ;

9. (m, n) (s, + ) = (s n, + ), if s n;

10. (m, n) (s, + ) = [0, + ), if s < n;

11. (m, + ) (s, + ) = [0, + );

13. H1 H2 = H2 H1;

14. (H1 H2) H = (H1 H) (H2 H);

15. (S k Hk) H = S k(Hk H), where k N>0;

16. (H1 H2) H = (H1 H) (H2 H), if (H1 H2) = ;

17. (H1 H2) H = H1 (H2 H).

In Lemma 16, Rule 14 is a special case of Rule 15, where k = 2. Rule 16 states that the composition operation is distributive over non-empty intersections of intervals. It is necessary to require H1 H2 = , otherwise the property may not hold. For example, if H1 = [0, 1], H2 = [2, 3], H = [1, 2], then (H1 H2) H = whilst (H1 H) (H2 H) = [0, 3] [0, 5] = . A similar property is deﬁned by Li, Long, Liu, Duckham, and Both (2015) for RCC relations. The proofs for the last three calculation rules are provided in Appendix A, whilst others are more obvious. For an interval h of the form (l, u), [l, u), (l, u] or [l, u], we call l the greatest lower bound of h, represented as glb(h), and u the least upper bound of h, represented as lub(h). Below we show some interesting properties regarding the composition of intervals and their greatest lower/least upper bounds.

Lemma 17 For any non-negative non-empty intervals g, h, the following properties hold:

Qualitative Spatial Logics for Buffered Geometries

1. lub(g h) = lub(g) + lub(h);

2. glb(g h) max(glb(g), glb(h)).

Proof. Follows from Lemma 16.

A non-empty interval h is right-closed, iﬀh = [x, y] or h = (x, y]. h is right-open, iﬀ h = [x, y) or h = (x, y). h is right-inﬁnite, iﬀh = [x, ) or h = (x, ). h is left-closed, iﬀ h = [x, y] or h = [x, y). h is left-open, iﬀh = (x, y] or h = (x, y).

Lemma 18 Let g1, g2, g3 be non-negative non-empty right-closed intervals, if g1 g2 g3, then lub(g1) lub(g2) + lub(g3).

Proof. Suppose g1 g2 g3. Since lub(g1) g1, lub(g1) g2 g3. By Lemma 15, there exist d2 g2, d3 g3, such that lub(g1) d2 + d3. Since d2 lub(g2), d3 lub(g3), lub(g1) lub(g2) + lub(g3).

Lemma 19 Let g1, g2, g3 be non-negative non-empty intervals, g1 g2 g3. If g1 is rightinﬁnite, then g2 or g3 is right-inﬁnite.

Proof. Suppose g1 is right-inﬁnite. Since g1 g2 g3, g2 g3 is right-inﬁnite. By Deﬁnition 7 and Lemma 16, g2 or g3 is right-inﬁnite.

5.2.2 Distance Constraints in D(Σ+) and DS(Σ+)

In this section, we characterize the distance constraints which appear in the process of enforcing path-consistency on D(Σ+) in two main steps:

Step 1 We characterize the intervals involved in D(Σ+), as well as the composition of those intervals. This step is described by Deﬁnition 14 and Lemmas 20-24.

Step 2 We introduce the notion of DS(Σ+) as a set containing all distance constraints appearing in the process of enforcing path-consistency on D(Σ+), and characterize the distance constraints in DS(Σ+). This step is described by Deﬁnitions 15-17 and Lemmas 25-31.

Deﬁnition 14 (Primitive, Composite, Deﬁnable Intervals) Let h be a non-negative interval. h is primitive, if h is one of [0, σ], (σ, ), [0, 2σ], (2σ, ), (2σ, 4σ], (4σ, ), [0, ). h is composite, if it can be obtained as the composition of at least two primitive intervals. h is deﬁnable, if it is primitive or composite.

Lemma 20 For any non-negative interval h, h {0} = h.

Proof. Follows from Deﬁnition 7.

Since Lemma 20 holds, we call {0} an identity interval.

Du & Alechina

Lemma 21 If an interval occurs in D(Σ+), then it is an identity interval or a primitive interval.

Proof. Follows from Deﬁnition 14 and Lemma 11.

Lemma 22 If h is a deﬁnable interval, then h = .

Proof. Follows from Deﬁnition 14 and Deﬁnition 7.

Lemma 23 If an interval h is deﬁnable, then the following properties hold:

1. glb(h) = nσ, n {0, 1, 2, 3, 4};

2. lub(h) = + or lub(h) = mσ, m N>0.

Proof. Let us prove by induction on the structure of h. Base case: h is primitive. By Deﬁnition 14, n {0, 1, 2, 4}, lub(h) = + or m {1, 2, 4}. Inductive step: Suppose Properties 1, 2 hold for any interval ht which can be obtained as the composition of t primitive intervals, where t N>0 (induction hypothesis). We will show Properties 1, 2 hold for any interval ht+1 which can be obtained as the composition of (t + 1) primitive intervals. For any ht+1, there exist an ht and a primitive interval hp such that ht+1 = ht hp. By induction hypothesis, glb(ht) = ntσ, nt {0, 1, 2, 3, 4}; lub(ht) = + or lub(ht) = mtσ, mt N>0. From the base case, glb(hp) = npσ, np {0, 1, 2, 4}; lub(hp) = + or lub(hp) = mpσ, mp {1, 2, 4}. By Lemma 17, lub(ht+1) = lub(ht) + lub(hp). Thus, Property 2 holds. By Lemma 16, if

lub(ht) < glb(hp), then glb(ht+1) = glb(hp) lub(ht);

lub(hp) < glb(ht), then glb(ht+1) = glb(ht) lub(hp);

otherwise, glb(ht+1) = 0.

Since mt > 0 and mp > 0, for glb(ht+1) = nt+1σ, nt+1 < 4. In each case, nt+1 {0, 1, 2, 3} (Property 1 holds).

Lemma 24 If h is an identity or deﬁnable interval, then:

1. lub(h) = 0, iﬀh = {0};

2. lub(h) = σ, iﬀh = [0, σ];

3. glb(h) = 4σ, iﬀh = (4σ, ).

Proof. Follows from Lemma 17, Lemma 23 and its proof.

Now we start to characterize the distance constraints which appear in the process of enforcing path-consistency on D(Σ+).

Qualitative Spatial Logics for Buffered Geometries

Deﬁnition 15 (DS(Σ+)) DS(Σ+) is a minimal set of distance constraints such that the following holds:

Any distance constraint in D(Σ+) is in DS(Σ+);

If distance constraints d(p, q) h and d(q, s) g are in DS(Σ+), then d(p, s) h g is in DS(Σ+);

If distance constraints d(p, q) h and d(p, q) g are in DS(Σ+), then d(p, q) h g is in DS(Σ+),

where p, q, s are constants in D(Σ+).

In the deﬁnition above, DS(Σ+) is required to be minimal, such that any interval involved in DS(Σ+) is either in D(Σ+) or is obtained by applying composition or intersection operations on intervals in D(Σ+). For generality, we do not restrict p, q, s to be diﬀerent constants. For example, it is possible p = q.

Lemma 25 If a distance constraint appears in the process of enforcing path-consistency on D(Σ+), then it is in DS(Σ+).

Proof. Follows from Deﬁnition 8 (path-consistency) and Deﬁnition 15.

DS(Σ+) covers all the distance constraints appearing in the process of enforcing pathconsistency on D(Σ+). However, not every distance constraint in DS(Σ+) necessarily appears in the process of enforcing path-consistency on D(Σ+). For example, if D(Σ+) contains exactly one distance constraint d(p, p) [0, σ], then by Deﬁnition 15, d(p, p) [0, 2σ] is in DS(Σ+) (so is d(p, p) [0, nσ], for any n N>0), but by Deﬁnition 8, d(p, p) [0, 2σ] does not appear in the process of enforcing path-consistency. It is easy to see that DS(Σ+) is an inﬁnite set. The concept of DS(Σ+) is similar to the concept of distributive subalgebra deﬁned by Li et al. (2015), as the composition operation distributes over non-empty intersections of intervals involved in DS(Σ+) (Rule 16 in Lemma 16). However, in our work, the composition operation is deﬁned for intervals rather than relations.

Lemma 26 If a distance constraint d(p, q) h is in DS(Σ+), then h is a non-negative interval.

Proof. If a distance constraint d(p, q) h is in D(Σ+), by Lemma 11, h is a non-negative interval. By Deﬁnitions 5, 7 and the deﬁnition of intersection, applying composition or intersection on non-negative intervals, we obtain non-negative intervals. By Deﬁnition 15, h is a non-negative interval.

Diﬀering from the previous version (Du & Alechina, 2014b), the following deﬁnitions and lemmas are restricted to non-empty intervals. Recall that a non-empty interval h is right-closed, iﬀh = [x, y] or h = (x, y]. h is right-open, iﬀh = [x, y) or h = (x, y). h is right-inﬁnite, iﬀh = [x, ) or h = (x, ). h is left-closed, iﬀh = [x, y] or h = [x, y). h is left-open, iﬀh = (x, y] or h = (x, y).

Du & Alechina

Lemma 27 If a distance constraint d(p, q) h is in DS(Σ+) and h = , then h is either right-inﬁnite or right-closed.

Proof. Let n denote the total number of times of applying composition or intersection to obtain h, n 0. We prove by induction on n. Base case: n = 0, then d(p, q) h is in D(Σ+). By Lemma 11, h is either right-inﬁnite or right-closed. Inductive step: Suppose the statement holds for any non-empty h which can be obtained by applying composition or intersection no more than n times (induction hypothesis). We will show it also holds for any non-empty h which can be obtained by applying composition or intersection (n + 1) times.

If the last step to obtain h is intersection, then by Deﬁnition 15, there exist non-empty h1, h2 such that h = h1 h2. By induction hypothesis, for each hi, i {1, 2}, hi is either right-inﬁnite or right-closed. By intersection rules, h is either right-inﬁnite or right-closed.

If the last step to obtain h is composition, then by Deﬁnition 15, there exist nonempty h1, h2 such that h = h1 h2. By induction hypothesis, for each hi, i {1, 2}, hi is either right-inﬁnite or right-closed. By composition rules (Lemma 16), h is either right-inﬁnite or right-closed.

Lemma 28 For a distance constraint d(p, q) h in DS(Σ+) and h = , if glb(h) = 0, then h is left-open.

Proof. Let n denote the total number of times of applying composition or intersection to obtain h, n 0. We prove by induction on n. Base case: n = 0, then d(p, q) h is in D(Σ+). By Lemma 11, if glb(h) = 0, then h is left-open. Inductive step: Suppose the statement holds for any non-empty h which can be obtained by applying composition or intersection no more than n times (induction hypothesis). We will show it also holds for any non-empty h which can be obtained by applying composition or intersection (n + 1) times.

If the last step to obtain h is intersection, then by Deﬁnition 15, there exist nonempty h1, h2 such that h = h1 h2. By induction hypothesis, for each hi, i {1, 2}, if glb(hi) = 0, then hi is left-open. By intersection rules, if glb(h) = 0, then h is left-open.

If the last step to obtain h is composition, then by Deﬁnition 15, there exist non-empty h1, h2 such that h = h1 h2. If glb(h) = 0, then by composition rules (Lemma 16), h1 h2 = . Suppose lub(h1) glb(h2), then glb(h) = glb(h2) lub(h1). By Lemma 26 and glb(h) = 0, we have glb(h) > 0, thus glb(h2) > lub(h1). By Lemma 26, lub(h1) 0, thus glb(h2) > 0. By induction hypothesis, h2 is left-open. By composition rules (Lemma 16), h is left-open. Similarly, this also holds if lub(h2) glb(h1).

Qualitative Spatial Logics for Buffered Geometries

For a distance constraint d(p, q) h in DS(Σ+), by Deﬁnition 15, h is obtained by applying the composition and/or intersection operations n 0 times on intervals occurring in D(Σ+). As applying the intersection operation does not generate any new bound, the greatest lower/least upper bound (and its openness) of h must be the same as that of an interval in D(Σ+) or the composition of intervals in D(Σ+). We formalize this rationale as the concepts of Left-Deﬁnable and Right-Deﬁnable to characterize the distance constraints in DS(Σ+). Later, we will show that every distance constraint d(p, q) h (h = ) in DS(Σ+) is left-deﬁnable and right-deﬁnable. Left-Deﬁnable and Right-Deﬁnable are key concepts for proving the Path-Consistency Lemma, as they establish the correspondences between a distance constraint in DS(Σ+) and a sequence of distance constraints in D(Σ+). If a non-empty interval h is left-open, then its greatest lower bound is represented as glb (h). If h is left-closed, then its greatest lower bound is represented as glb+(h). If h is right-open, then its least upper bound is represented as lub (h). If h is right-closed, then its least upper bound is represented as lub+(h).

Deﬁnition 16 (Left-Deﬁnable) A distance constraint d(p1, pn) hs (n > 1) is leftdeﬁnable, iﬀhs = and there exists a sequence of distance constraints d(pi, pi+1) hi (0 < i < n) in D(Σ+), such that for m = h1 ... hn 1, the following holds:

1. If hs is left-open, then m is left-open and glb (m) = glb (hs);

2. If hs is left-closed, then m is left-closed and glb+(m) = glb+(hs);

Deﬁnition 17 (Right-Deﬁnable) A distance constraint d(p1, pn) hs (n > 1) is rightdeﬁnable, iﬀhs = and there exists a sequence of distance constraints d(pi, pi+1) hi (0 < i < n) in D(Σ+), such that for m = h1 ... hn 1, the following holds:

1. If hs is right-open, then m is right-open and lub (m) = lub (hs);

2. If hs is right-closed, then m is right-closed and lub+(m) = lub+(hs);

It is important to distinguish the deﬁnition of left-deﬁnable/right-deﬁnable distance constraints (Deﬁnitions 16 and 17) from Deﬁnition 14 (Deﬁnable Intervals). For example, if distance constraints d(p1, p2) {0} and d(p2, p3) {0} are in D(Σ+), then d(p1, p3) {0} is left-deﬁnable and right-deﬁnable, but {0} is not a deﬁnable interval. If distance constraints d(p1, p2) [0, σ] and d(p2, p3) (4σ, ) are in D(Σ+), then d(p1, p3) (3σ, 5σ] is leftdeﬁnable, but (3σ, 5σ] is not a deﬁnable interval.

Lemma 29 Let h, g be non-negative intervals. If distance constraints d(p, q) h and d(q, s) g are left-deﬁnable and right-deﬁnable, then d(p, s) h g is left-deﬁnable and right-deﬁnable.

Du & Alechina

Proof. Since d(p, q) h and d(q, s) g are right-deﬁnable, then by Deﬁnition 17, h = , g = . By Deﬁnition 7, h g = . By Deﬁnition 17, in D(Σ+), there exist a sequence of distance constraints d(p, x2) h1, ..., d(xn 1, q) hn 1 for d(p, q) h and a sequence of distance constraints d(q, y2) g1, ..., d(yt 1, s) gt 1 for d(q, s) g respectively satisfying the three properties. Let us take the union of the two sequences as a new one, this is, d(p, x2) h1, ..., d(xn 1, q) hn 1, d(q, y2) g1, ..., d(yt 1, s) gt 1. By composition rules (Lemma 16), the new sequence satisﬁes the properties in Deﬁnition 17 for d(p, s) h g. Hence, d(p, s) h g is right-deﬁnable. By composition rules (Lemma 16), if h g = , then glb+(h g) = 0. We can use the same new sequence above. Let m1 = (h1 ... hn 1), m2 = (g1 ... gt 1). By Deﬁnition 17, h m1, g m2. Then m1 m2 = , therefore, glb+(m1 m2) = 0. By Deﬁnition 7, h g m1 m2. By Deﬁnition 16, d(p, s) h g is left-deﬁnable. If h g = , let us suppose glb(h) lub(g). Since d(p, q) h is left-deﬁnable and d(q, s) g is right-deﬁnable, by Deﬁnitions 16 and 17 respectively, in D(Σ+), there exist a sequence of distance constraints for d(p, q) h and a sequence of distance constraints for d(q, s) g, satisfying the corresponding properties. Then by composition rules (Lemma 16), the union of the two sequences satisﬁes the properties in Deﬁnition 16 for d(p, s) h g. Hence, d(p, s) h g is left-deﬁnable. Similarly, we can show d(p, s) h g is left-deﬁnable, if glb(g) lub(h).

Lemma 30 Let h, g be non-negative intervals. If distance constraints d(p, q) h and d(p, q) g are left-deﬁnable and right-deﬁnable, h g = , then d(p, q) h g is leftdeﬁnable and right-deﬁnable.

Proof. As applying intersections does not generate any new bound and h g = , the left/right bound of h g is the same as that of h or g. If the left bound of h g is the same as that of h, then by Deﬁnition 16, the same sequence used for showing d(p, q) h is left-deﬁnable can be used to show d(p, q) h g is left-deﬁnable. Other cases are similar.

Lemma 31 If a distance constraint d(p, q) h is in DS(Σ+) and h = , then it is leftdeﬁnable and right-deﬁnable.

Proof. Let n denote the total number of times of applying composition or intersection to obtain h, n 0. We prove by induction on n. Base case: n = 0, then d(p, q) h is in D(Σ+). By Deﬁnitions 16 and 17, d(p, q) h is left-deﬁnable and right-deﬁnable. Inductive step: Suppose the statement holds for any non-empty h which can be obtained by applying composition or intersection no more than n times (induction hypothesis). We will show it also holds for any non-empty h which can be obtained by applying composition or intersection (n + 1) times. By Deﬁnition 15, the last operation to obtain h is either composition or intersection. In the former case, there exist d(p, s) g1 and d(s, q) g2 in DS(Σ+), such that g1 g2 = h. As h = , by Deﬁnition 7, gi = , i {1, 2}. Since g1 and g2 are obtained by applying composition or intersection no more than n times, then by induction hypothesis, d(p, s) g1 and d(s, q) g2 are left-deﬁnable and right-deﬁnable. By

Qualitative Spatial Logics for Buffered Geometries

Lemma 29, d(p, q) h is left-deﬁnable and right-deﬁnable. In the latter case, there exist d(p, q) g1 and d(p, q) g2 in DS(Σ+), such that g1 g2 = h. As h = , by intersection rules, gi = , i {1, 2}. By induction hypothesis, d(p, q) g1 and d(p, q) g2 are leftdeﬁnable and right-deﬁnable. By Lemma 30, d(p, q) h is left-deﬁnable and right-deﬁnable.

For generality, we do not exclude the possibility that d(p, q) is in DS(Σ+). However, it follows from the proof of the Path-Consistency Lemma in Section 5.3 that d(p, q) is not in DS(Σ+). Alternatively, for a direct proof, see Lemmas 45 and 46 in Appendix C.

5.2.3 Proving the Metric Space Lemma

In the following, we show there is a metric space satisfying D(Σ+), if D(Σ+) is pathconsistent (Metric Space Lemma). Firstly, we show that the process of enforcing patchconsistency on D(Σ+) terminates. By Lemma 13, the number of constants in D(Σ+) is ﬁnite. Let us suppose the number of constants in D(Σ+) is t N>0.

Lemma 32 Let t N>0 be the number of constants in D(Σ+). For any non-empty right-closed interval h referred to in the process of enforcing path-consistency on D(Σ+), lub+(h) 4σt.

Proof. If a non-empty right-closed interval h occurs in D(Σ+), then by Lemma 11, lub+(h) 4σ 4σt. Otherwise, it is generated from the application of composition and/or intersection operators by Deﬁnition 8. Composition creates larger least upper bounds (Lemma 17), whilst intersection does not. Since h is right-closed, lub+(h) is obtained by composing right-closed intervals only (Lemma 16). Over t constants, the longest chain involves (t 1) intervals. lub+(h) is maximal if we use all of these (t 1) intervals and the least upper bound of each interval is 4σ. Thus, lub+(h) 4σt.

Lemma 33 Let t N>0 be the number of constants in D(Σ+). Enforcing path-consistency on D(Σ+), a ﬁxed point can be reached in O(t3).

Proof. By Deﬁnition 8, Lemmas 23 and the fact that intersection does not generate new bounds, for any interval s appearing in the process of enforcing path-consistency on D(Σ+), the following properties hold:

1. glb(s) = nσ, n {0, 1, 2, 3, 4};

2. lub(s) = + or lub(s) = mσ, m N>0.

For any interval h appearing in D(Σ+), by enforcing path-consistency (Deﬁnition 8), h can only become an h h. By Lemma 11, h = . By Lemma 27, h is either right-closed or right-inﬁnite, h is , right-closed or right-inﬁnite.

If h is right-closed, then h = or h is right-closed. If h is right-closed, then by Lemma 11, lub(h ) lub(h) 4σ. By Properties 1, 2, there are ﬁnitely many possibilities for h .

Du & Alechina

If h is right-inﬁnite, then h is , right-closed or right-inﬁnite.

If h is right-closed, then by Lemma 32, lub(h ) 4tσ. By Properties 1, 2, there are ﬁnitely many possibilities for h .

If h is right-inﬁnite, then by Property 1, there are ﬁnitely many possibilities for its greatest lower bound, thus for h .

Since in each case, there are ﬁnitely many possibilities for h , a ﬁxed point is always reached. Suppose the widest non-negative interval [0, ) appears in the process of enforcing path-consistency on D(Σ+). In the worst case, ﬁrstly, [0, ) is strengthened to [0, u], where u 4tσ (by Lemma 32), then [0, u] is strengthened by σ each time. Hence, [0, ) can be strengthened at most (4t + 1) times. Over t constants, by Deﬁnition 13, there are O(t2) distance constraints in D(Σ+). For any interval h appearing in D(Σ+), h [0, ), hence h can be strengthened at most (4t + 1) times. Therefore, the total time of strengthening all the distance constraints is O(t3).

The following lemma shows how to construct a metric space from D(Σ+). It is used to prove the Metric Space Lemma.

Lemma 34 Let t N>0 be the number of constants in D(Σ+), Df(Σ+) be a ﬁxed point of enforcing path consistency on D(Σ+). If D(Σ+) is path-consistent, Ds(Σ+) is obtained from Df(Σ+) by replacing every right-inﬁnite interval with {5tσ}, every right-closed interval h with {lub(h)}, then Ds(Σ+) is path-consistent.

Proof. Suppose D(Σ+) is path-consistent. By Lemma 25, Df(Σ+) DS(Σ+). By Deﬁnition 8, for any interval h appearing in Df(Σ+), h = . By Lemma 27, h is either right-inﬁnite or right-closed. To prove Ds(Σ+) is path-consistent, we only need to show that for any three distance ranges, {npq}, {nqs}, {nps} in Ds(Σ+) over three constants p, q, s, we have

1. npq nqs + nps;

2. nqs npq + nps;

3. nps npq + nqs.

Let hpq, hqs, hps denote the corresponding distance ranges of {npq}, {nqs}, {nps} respectively in Df(Σ+), by Deﬁnition 8, we have

hpq hqs hps;

hqs hpq hps;

hps hpq hqs.

We prove Ds(Σ+) is path-consistent by cases:

If every hi (i {pq, qs, ps}) is right-closed, then ni = lub(hi). By Lemma 18, 1-3 hold.

Qualitative Spatial Logics for Buffered Geometries

Otherwise, not all of them are right-closed. By Lemma 19, at least two of them are right-inﬁnite.

If all of them are right-inﬁnite, then ni = 5tσ. Since 5tσ 5tσ + 5tσ, 1-3 hold.

Otherwise, only one of them is right-closed. Let hpq be right-closed. Then, npq = lub(hpq), nqs = 5tσ, nps = 5tσ. By Lemma 32 and σ R 0, lub(hpq) 4tσ < 5tσ. By Lemma 26, lub(hpq) 0. Since lub(hpq) < 5tσ + 5tσ and 5tσ 5tσ + lub(hpq), 1-3 hold.

Lemma 7 (Metric Space Lemma) Let Σ be a ﬁnite consistent set of formulas, and Σ+

be an MCS which contains Σ and is over the same set of individual names as Σ. If D(Σ+) is path-consistent, then there is a metric space ( , d) such that all distance constraints in D(Σ+) are satisﬁed.

Proof. Suppose D(Σ+) is path-consistent. Let be the set of constants in D(Σ+), which is used to interpret individual names occurring in Σ, as shown in Deﬁnition 13. If = , then it is trivial. Let us assume = . The number of constants in is denoted by t N>0. By Lemma 33, a ﬁxed point Df(Σ+) can be reached by enforcing path-consistency on D(Σ+). Let Ds(Σ+) be a set of distance constraints obtained from Df(Σ+) by replacing every right-inﬁnite interval with {5tσ}, every right-closed interval h with {lub(h)}. Since every distance constraint in Ds(Σ+) is of the form d(p, q) {r}, where r R 0, and d(p, q) {r} is equivalent to d(p, q) = r, a metric (distance function) is deﬁned over . By Deﬁnition 13 and Lemma 34, for any pair of constants x, y, if x = y, then d(x, y) = 0 holds in Ds(Σ+); if x = y, then d(x, y) σ > 0 holds in Ds(Σ+). Thus, we have d(x, y) = 0 iﬀx = y in Ds(Σ+). By Deﬁnitions 13 and 8, for any pair of constants x, y, d(x, y) = d(y, x) holds in Ds(Σ+). By Lemma 34, Ds(Σ+) is path-consistent. Thus, for any constants x, y, z, d(x, z) d(x, y) + d(y, z) holds in Ds(Σ+). By Deﬁnition 1, the ( , d) of Ds(Σ+) is a metric space such that all distance constraints in D(Σ+) are satisﬁed.

5.3 Path-Consistency Lemma

This section proves the Path-Consistency Lemma by contradiction, supposing that D(Σ+) is not path-consistent. We examine every case where the ﬁrst interval is obtained by enforcing path-consistency. In each case, we show that is derivable from the corresponding LBPT formulas in Σ+ using LBPT axioms. This contradicts the assumption that Σ+ is consistent. Lemma 35 is used to generate all possible cases and make sure no duplicated ones are generated. By using Lemma 35, the proof of the Path-Consistency Lemma is largely simpliﬁed, compared to the previous version (Du & Alechina, 2014b).

Lemma 35 Let g, h be non-negative intervals. g h = iﬀ(g h) {0} = .

Proof. If g h = , then by Deﬁnition 7, 0 (g h). If 0 (g h), then by Lemma 15, there exist d1 g, d2 h such that 0 [|d1 d2|, d1+d2]. Thus, d1 = d2. Therefore, g h = .

Du & Alechina

Since g h = iﬀ0 (g h), by contraposition we get g h = iﬀ(g h) {0} = .

Knowing a least upper bound or a greatest lower bound of a deﬁnable interval h, Lemmas 36-42 show all possible ways in which h can be obtained as the composition of primitive intervals. Lemma 36 and Lemma 39 are proved below. Proofs of the other lemmas are similar and omitted.

Lemma 36 If an interval h is deﬁnable, lub(h) = 2σ, then h is a primitive interval [0, 2σ] or h is obtained as the composition of two [0, σ].

Proof. If h is primitive, then by Deﬁnition 14, h = [0, 2σ]. If h is composite, then by Deﬁnition 14, there exist two deﬁnable intervals g1, g2 such that g1 g2 = h. By Lemma 17, lub(g1) + lub(g2) = 2σ. By Lemma 23, lub(g1) σ, lub(g2) σ, thus lub(g1) = σ, lub(g2) = σ. By Lemma 24, h = [0, σ] [0, σ].

Lemma 37 If an interval h is deﬁnable, lub(h) = 3σ, then h is obtained as the composition of [0, σ] and [0, 2σ] or as the composition of three [0, σ].

Lemma 38 If an interval h is deﬁnable, lub(h) = 4σ, then h is a primitive interval (2σ, 4σ], or h is obtained as the composition of two [0, 2σ], as the composition of two [0, σ] and one [0, 2σ] or as the composition of four [0, σ].

Lemma 39 If an interval h is deﬁnable, glb(h) = 3σ, then h is obtained as the composition of [0, σ] and (4σ, ).

Proof. By Deﬁnition 14, h cannot be primitive. Since h is composite, then by Deﬁnition 14, there exist two deﬁnable intervals g1, g2 such that g1 g2 = h. g1 g2 = , otherwise, by Lemma 16, glb(h) = 0. Without loss of generality, let us suppose lub(g1) glb(g2). By Lemma 16, glb(g2) lub(g1) = 3σ. By Lemma 23, glb(g2) 4σ, lub(g1) σ, thus glb(g2) = 4σ, lub(g1) = σ. By Lemma 24, h is obtained as the composition of [0, σ] and (4σ, ).

Lemma 40 If an interval h is deﬁnable, glb(h) = 2σ, then h is a primitive interval (2σ, ) or (2σ, 4σ], or h is obtained as the composition of [0, 2σ] and (4σ, ) or as the composition of two [0, σ] and one (4σ, ).

Lemma 41 If an interval h is deﬁnable, glb(h) = σ, then h is a primitive interval (σ, ), or h is obtained as the composition of [0, σ] and (2σ, ), as the composition of [0, σ] and (2σ, 4σ], as the composition of one [0, σ], one [0, 2σ] and one (4σ, ), or as the composition of three [0, σ] and one (4σ, ).

Lemma 42 If an interval h is deﬁnable and left-open, glb(h) = 0, then h is obtained in exactly the following ways:

as the composition of [0, σ] and (σ, );

Qualitative Spatial Logics for Buffered Geometries

as the composition of [0, 2σ] and (2σ, );

as the composition of two [0, σ] and one (2σ, );

as the composition of [0, 2σ] and (2σ, 4σ];

as the composition of two [0, σ] and one (2σ, 4σ];

as the composition of (2σ, 4σ] and (4σ, );

as the composition of two [0, 2σ] and one (4σ, );

as the composition of two [0, σ], one [0, 2σ] and one (4σ, );

as the composition of four [0, σ] and one (4σ, ).

In our previous work (Du et al., 2013; Du & Alechina, 2014b), we presented a slightly diﬀerent way to prove the Path-Consistency Lemma for LNF and LBPT respectively: the ﬁrst empty interval is obtained using the strengthening operator, this is, g1 (g2 g3) = and gi = , where i {1, 2, 3}. gi may be {0} or a primitive interval, or it can be written as xi (yi zi), where each of xi, yi, zi may not be an identity or primitive internal also. In the latter case, since gi = xi (yi zi) = , xi, yi, zi are all not empty. Since the composition operation is distributive over non-empty intersections of intervals (Rule 16 in Lemma 16), we use Rule 16 repeatedly to rewrite g1 (g2 g3) until every interval is an identity or primitive interval. The ﬁnal form is h1 ... hn = , n > 1, where hx (0 < x n) is {0} or a deﬁnable interval. Thus there exist two intervals hi, hj (0 < i n, 0 < j n, i = j) such that hi hj = . Then we look at all the diﬀerent combinations such that lub(hi) glb(hj). There are exactly 15 such combinations. In this paper, the proof of the Path-Consistency Lemma is largely simpliﬁed. It shows that it is suﬃcient to examine 5 rather than 15 combinations.

Lemma 8 (Path-Consistency Lemma) Let Σ be a ﬁnite consistent set of formulas, and Σ+ be an MCS which contains Σ and is over the same set of individual names as Σ. Then, D(Σ+) is path-consistent.

Proof. Suppose D(Σ+) is not path-consistent. Then by Deﬁnitions 8, 15 and Lemma 25, d(p, q) is in DS(Σ+), for some constants p, q. By Lemma 11, for any distance range g occurring in D(Σ+), g = . By Deﬁnitions 15, 7, and intersection rules, the last operation to obtain the ﬁrst interval is intersection. By Deﬁnition 15, there exist d(p, q) g1 and d(p, q) g2 in DS(Σ+), g1 = , g2 = , and g1 g2 = . By Lemma 26, g1, g2 are non-negative intervals. By Lemma 35, g1 g2 = iﬀ(g1 g2) {0} = . By Deﬁnition 13 and Deﬁnition 15, d(q, p) g2 is in DS(Σ+). Since d(p, q) g1 is in DS(Σ+), by Deﬁnition 15, d(p, p) (g1 g2) is in DS(Σ+). By Deﬁnition 7, g1 g2 = . By Lemma 31, d(p, p) (g1 g2) is left-deﬁnable and right-deﬁnable. Let h = g1 g2. Since d(p, p) h is left-deﬁnable, then by Deﬁnition 16, there exists a sequence of distance constraints d(pi, pi+1) hi (0 < i < n) in D(Σ+), such that p = p1 = pn and for h = h1 ... hn 1, h and h have the same greatest lower bound (including both value and

Du & Alechina

openness) and h h . By Deﬁnition 14, Lemmas 21 and 20, h is an identity or deﬁnable interval. By Lemma 23, glb(h ) {0, σ, 2σ, 3σ, 4σ}. Therefore, (g1 g2) {0} = iﬀone of the following holds:

glb(h) {σ, 2σ, 3σ, 4σ};

h is left-open and glb (h) = 0.

We will check whether can be derived in every case using axioms (or derivable facts). By Axiom 3 and Axiom 4, NEAR and FAR are both symmetric.

1. glb(h) = σ: we look at all the diﬀerent ways where h is obtained from a sequence of distance constraints d(pi, pi+1) hi (0 < i < n) in D(Σ+) such that p = p1 = pn and h = h1 ... hn 1 (see Deﬁnition 16). As every hi is {0} or a primitive interval (by Lemma 21), Lemma 41 speciﬁes all the diﬀerent ways to obtain h :

(a) h is a primitive interval (σ, + ): by Deﬁnition 16, d(p1, pn) (σ, + ) is in D(Σ+) and n = 2. As p = p1 = pn, d(p, p) (σ, + ) is in D(Σ+). Suppose p points(a) for an individual name a in Σ. By the proof of Lemma 11, (σ, + ) only can come from formulas of the form ξ(x, y, (σ, )), where x, y are individual names. By Deﬁnition 10, ξ(x, y, (σ, )) only can come from BPT(x, y). Since d(p, p) (σ, + ) is in D(Σ+) and p points(a), BPT(a, a) Σ+. By Axiom 9, BPT(a, a) . (b) h is obtained as the composition of [0, σ] and (2σ, ) or as the composition of [0, σ] and (2σ, 4σ]: by the proof of Lemma 11 and Deﬁnition 10, BPT(a, b) Σ+ or BPT(b, a) Σ+, NEAR(a, b) Σ+ and NEAR(b, a) Σ+. By Fact 14, BPT(x1, x2) NEAR(x1, x2) , {x1, x2} = {a, b}. (c) h is obtained as the composition of one [0, σ], one [0, 2σ] and one (4σ, ): by the proof of Lemma 11 and Deﬁnition 10, BPT(a, b) Σ+ or BPT(b, a) Σ+, NEAR(b, c) Σ+, NEAR(c, b) Σ+, FAR(c, a) Σ+, FAR(a, c) Σ+. By Fact 16, BPT(x2, x1) NEAR(x2, x3) FAR(x3, x1) , {x1, x2, x3} = {a, b, c}. (d) h is obtained as the composition of three [0, σ] and one (4σ, + ): by the proof of Lemma 11 and Deﬁnition 10, we have three BPT and one FAR over four individual names a, b, c, d. BPT refers to either BPT(x, y) or BPT(y, x). Some cases (for example, in Σ+, we have BPT(a, b), BPT(c, b), BPT(d, c) and FAR(a, d)) are not valid, because diﬀerent constants will be taken from the same points(b), for an individual name b (by Deﬁnition 13). As a consequence, in an invalid case, a sequence consisting of distance constraints d(pi, pi+1) hi (0 < i < n, p = p1 = pn) cannot exist in D(Σ+). We only need to consider all valid cases, which are listed below. i. BPT(x1, x2), BPT(x2, x3), BPT(x3, x4), FAR(x4, x1), where {x1, x2, x3, x4} = {a, b, c, d}. By Fact 20, BPT(x1, x2) BPT(x2, x3) BPT(x3, x4) FAR(x4, x1) . ii. BPT(x2, x1), BPT(x2, x3), BPT(x3, x4), FAR(x4, x1), where {x1, x2, x3, x4} = {a, b, c, d}. By Fact 21, BPT(x2, x1) BPT(x2, x3) BPT(x3, x4) FAR(x4, x1) .

Qualitative Spatial Logics for Buffered Geometries

Cases 2-5 below use similar arguments. In the following proof, BPT refers to either BPT(x, y) or BPT(y, x) (whichever makes the corresponding case valid). NEAR and FAR are symmetric, thus the order of x, y does not matter.

2. glb(h) = 2σ: by Deﬁnition 16 and Lemma 21, Lemma 40 speciﬁes all the diﬀerent ways to obtain h from a sequence of distance constraints d(pi, pi+1) hi (0 < i < n) in D(Σ+):

(a) h is a primitive interval (2σ, ) or (2σ, 4σ]: NEAR(a, a), using Axiom 9 and Fact 14.

(b) h is obtained as the composition of [0, 2σ] and (4σ, + ) : one NEAR and one FAR, using Fact 15.

(c) h is obtained as the composition of two [0, σ] and one (4σ, + ): two BPT and one FAR, using Facts 18 and 19.

3. glb(h) = 3σ: by Deﬁnition 16 and Lemma 21, Lemma 39 speciﬁes all the ways to obtain h from a sequence of distance constraints d(pi, pi+1) hi (0 < i < n) in D(Σ+). By Lemma 39, h is obtained as the composition of [0, σ] and (4σ, + ). one BPT and one FAR, using Fact 17.

4. glb(h) = 4σ: by Deﬁnition 16 and Lemma 21, Lemma 24 speciﬁes all the ways to obtain h from a sequence of distance constraints d(pi, pi+1) hi (0 < i < n) in D(Σ+). By Lemma 24, h = (4σ, + ). FAR(a, a), using Axiom 9 and Fact 17.

5. glb (h) = 0: by Deﬁnition 16 and Lemma 21, Lemma 42 speciﬁes all the ways to obtain h from a sequence of distance constraints d(pi, pi+1) hi (0 < i < n) in D(Σ+):

(a) h is obtained as the composition of [0, σ] and (σ, ): by Deﬁnition 13, ensuring no diﬀerent constants taken from the same points(x), BPT(x1, x2) Σ+ and BPT(x1, x2) Σ+, {x1, x2} = {a, b}. BPT(x1, x2) BPT(x1, x2) .

(b) h is obtained as the composition of [0, 2σ] and (2σ, ) or as the composition of [0, 2σ] and (2σ, 4σ]: one NEAR and one NEAR, using Axiom 3.

(c) h is obtained as the composition of two [0, σ] and one (2σ, ) or as the composition of two [0, σ] and one (2σ, 4σ]: two BPT and one NEAR, using Axioms 10 and 11.

(d) h is obtained as the composition of (2σ, 4σ] and (4σ, ): one FAR and one FAR, using Axiom 4.

(e) h is obtained as the composition of two [0, 2σ] and one (4σ, ): two NEAR and one FAR. This case is invalid. By Deﬁnition 16, D(Σ+) contains d(pa, pb) [0, 2σ], d(pb, pc) [0, 2σ] and d(pa, pc) (4σ, ), where pa points(a), pb points(b), pc points(c), for individual names a, b, c. By Deﬁnitions 10 and 13, d(pa, pb) [0, 2σ] and d(pb, pc) [0, 2σ] cannot come from NEAR(a, b) and NEAR(b, c) in Σ+ (by

Du & Alechina

the proof of Lemma 11, it is clear that they cannot come from other formulas as well), because two diﬀerent constants will be taken from points(b) as witnesses for (a, b, [0, 2σ]) and (b, c, [0, 2σ]) respectively.

(f) h is obtained as the composition of two [0, σ], one [0, 2σ] and one (4σ, ): two BPT, one NEAR and one FAR, using Axioms 12 and 13.

(g) h is obtained as the composition of four [0, σ] and one (4σ, ): four BPT and one FAR, using Facts 22-24.

In each valid case, is derivable using the corresponding axioms or facts, which contradicts the assumption that Σ+ is consistent. Therefore, D(Σ+) is path-consistent.

There is an alternative way to prove the Path-Consistency Lemma, which we believe is longer and more complicated than the one presented in this paper, but since it may provide additional intuitions to the reader, we sketch it in Appendix B.

6. Decidability and Complexity of LBPT

In this section, we establish the complexity of the LBPT satisﬁability problem. The complexity of the LNF/LNFS satisﬁability problem can be established in a similar way. The complexity of these satisﬁability problems is important, as it is related to the complexity of the problem of ﬁnding inconsistencies, which is the basis of our approach to debugging matches between geospatial datasets.

Deﬁnition 18 (Size of a Formula) The size of a LBPT formula s(φ) is deﬁned as follows:

s(BPT(a, b)) = 3, s(NEAR(a, b)) = 3, s(FAR(a, b)) = 3;

s( φ) = 1 + s(φ);

s(φ ψ) = 1 + s(φ) + s(ψ),

where a, b are individual names, φ, ψ are formulas in L(LBPT).

As a set of LBPT formulas S and the conjunction of all formulas in S are equi-satisﬁable, the combined size of LBPT formulas in a set S is deﬁned as the size of the conjunction of all formulas in S. Next we prove Theorem 2 for LBPT: the satisﬁability problem for a ﬁnite set of LBPT formulas in a metric space is NP-complete.

Proof. NP-hardness of the LBPT satisﬁability problem follows from NP-hardness of the satisﬁability problem for propositional logic, which is included in LBPT. To prove that the LBPT satisﬁability problem is in NP, we show that if a ﬁnite set of LBPT formulas Σ is satisﬁable, then we can guess a metric model for Σ and verify that this model satisﬁes Σ, both in time polynomial in the combined size of formulas in Σ. Suppose Σ is a ﬁnite set of LBPT formulas, and the number of individual names in Σ is n. The completeness proof shows that, if Σ is satisﬁable, it is satisﬁable in a metric model

Qualitative Spatial Logics for Buffered Geometries

M of size which is polynomially bounded by the number of individual names in Σ. To recap the construction of the metric model for Σ, ﬁrst we construct B(Σ+), the corresponding set of basic quantiﬁed formulas from an MCS Σ+ containing Σ, and then construct a model for B(Σ+). By Deﬁnition 10, the number of formulas in B(Σ+) is at most f = (n + 2n(n 1)). By Deﬁnitions 11 and 13, to every individual name a in Σ, we assign a ﬁxed set of new constants, points(a) = {p1 a, . . . , px a}, where x = num(a, B(Σ+)). Since x f, the number of constants in M is at most t = nf. By Lemma 34 and proofs of the Metric Space Lemma, in such a model M, every value assigned by the distance function is of the form mσ, m N 0, m 5t. We guess a metric model M like this. Let s be the combined size of formulas in Σ. Then n < s. To every individual name a in Σ, we assign {p1 a, . . . , px a}, where x < 2s2. This results in a set of constants , the size of which is < 2s3. To every pair of constants p, q in , we assign mσ to d(p, q), where m N 0, m < 10s3. To verify that ( , d) is a metric space, by Deﬁnition 1, it is in O(s9). To verify that M satisﬁes Σ, we need to verify that it satisﬁes the conjunction of all formulas in Σ. For any R(a, b), where R {BPT, NEAR, FAR}, a, b are individual names, to verify that R(a, b) is satisﬁed, it takes time which is polynomial in |points(a)| |points(b)|, thus it is in O(s4). Hence, verifying that M satisﬁes Σ can be done in O(s5).

In Section 3, we mentioned that σ acts as a scaling factor of a metric model. This is stated as Lemma 43 and follows from the proofs of the completeness theorem and Theorem 2. The proof for LBPT is provided below. The proof for LNF/LNFS is similar.

Lemma 43 A ﬁnite set of LNF/LNFS/LBPT formulas is satisﬁable in a metric model where σ R 0, iﬀit is satisﬁable in a metric model when σ = 1.

Proof.[for LBPT] Suppose Σ is a ﬁnite set of LBPT formulas, the number of individual names in Σ is n and the combined size of formulas in Σ is s. By Deﬁnition 18, n < s. The completeness proof shows that, if Σ is satisﬁable, it is satisﬁable in a metric model M = ( , d, I, σ) constructed as shown in Section 5.1 and Section 5.2. By Deﬁnition 13, to every individual name a in Σ, we assign {p1 a, . . . , px a}, where x = num(a, B(Σ+)) < 2s2. This results in a set of constants , the size of which is < 2s3. To every constant p in , we assign d(p, p) = 0. By Deﬁnition 13 and Lemma 34, to every pair of diﬀerent constants p, q in , we assign mσ to d(p, q), where m N>0, m < 10s3. As M is a metric model, by Deﬁnition 1 and the proof of the Metric Space Lemma, for any x, y, z , we have

1. d(x, y) = 0, iﬀx = y;

2. d(x, y) = mxyσ, iﬀd(y, x) = mxyσ;

3. if d(x, z) = mxzσ, d(x, y) = mxyσ, d(y, z) = myzσ, then mxzσ mxyσ + myzσ.

If M satisﬁes Σ, by Deﬁnition 3, the following holds:

M |= BPT(a, b) iﬀ pa I (a) pb I (b) : d(pa, pb) [0, σ];

M |= NEAR(a, b) iﬀ pa I (a) pb I (b) : d(pa, pb) [0, 2σ];

Du & Alechina

M |= FAR(a, b) iﬀ pa I (a) pb I (b) : d(pa, pb) (4σ, );

M |= φ iﬀM |= φ;

M |= φ ψ iﬀM |= φ and M |= ψ,

where a, b are individual names, φ, ψ are formulas in L(LBPT). By setting σ = 1, ( , d) is still a metric space, as the following holds for any x, y, z :

1. d(x, y) = 0 iﬀx = y;

2. d(x, y) = mxy, iﬀd(y, x) = mxy;

3. if d(x, z) = mxz, d(x, y) = mxy, d(y, z) = myz, then mxz mxy + myz.

By setting σ = 1, the deﬁnitions of BPT, NEAR and FAR change accordingly as well. One can easily see that M still satisﬁes Σ after replacing every σ by 1. Similar, if we have a metric model with σ = 1, we can obtain a metric model with σ R 0 by multiplying every distance value m by σ. One can easily see that M still satisﬁes Σ after multiplying all distance values in M by σ, and multiplying all greatest lower bounds and least upper bounds of intervals in the truth deﬁnitions of BPT, NEAR and FAR by σ.

7. Validating Matches using Spatial Logic

The spatial logics LNF, LNFS and LBPT can be used to verify consistency of same As and part Of matches between spatial objects from diﬀerent geospatial datasets. If every spatial object has a point geometry, then we apply LNF, otherwise, we apply LNFS or LBPT. LBPT reasoning has been used together with description logic reasoning in a geospatial data matching system Match Maps (Du, Nguyen, Alechina, Logan, Jackson, & Goodwin, 2015; Du, 2015). LBPT reasoning and description logic reasoning complement each other in the sense that LBPT reasoning veriﬁes matches regarding spatial information whilst description logic reasoning veriﬁes matches regarding classiﬁcation information, the unique name assumption and a stronger version of it. In the following, we describe how LBPT is used for debugging matches. A dedicated LBPT reasoner integrated with an assumption-based truth maintenance system (ATMS) (de Kleer, 1986) was developed as a part of Match Maps. It implements LBPT axioms and the deﬁnition BEQ(a, b) BPT(a, b) BPT(b, a) as a set of inference rules. For eﬃciency reasons, there is no one-to-one correspondence between rules and axioms. To speed up matching and avoid cycles, facts such as NEAR(a, b) are stored for only one order of a and b, and symmetry axioms are removed. Each of the remaining axioms involving any symmetric relation gives rise to several rules, to compensate for the removal of symmetry. For example, the axiom

BPT(b, a) NEAR(b, c) BPT(c, d) FAR(d, a)

also gives rise to a rule corresponding to the following implication:

BPT(b, a) NEAR(c, b) BPT(c, d) FAR(d, a)

Qualitative Spatial Logics for Buffered Geometries

(with NEAR(c, b) instead of NEAR(b, c)). However the set of rules is trivially equivalent to the set of axioms. Possible matches of the form same As(a, b) and part Of (a, b) (a is part Of b) are generated as assumptions, to be withdrawn if they are involved in a derivation of a contradiction in description logic or in LBPT. In order to apply the LBPT reasoning, same As(a, b) is replaced by BEQ(a, b), and part Of (a, b) is replaced by BPT(a, b). These substitutions do not aﬀect correctness of the matching results of Match Maps, as Match Maps adopts the definitions of same As and part Of such that same As(a, b) entails BEQ(a, b) and part Of (a, b) entails BPT(a, b). NEAR(a, b) and FAR(a, b) facts are generated for those objects a, b in the same dataset that are involved in some matches across the two datasets (there is an object c in the other dataset such that BEQ(a, c), BPT(a, c) or BPT(c, a) holds, and similarly for b). The LBPT reasoner derives new formulas by applying inference rules to previously derived formulas, and the ATMS maintains dependencies between derived consequences and a set of assumptions (corresponding to possible matches). In particular, it maintains all minimal sets of assumptions responsible for the derivation of (false), referred to as nogoods in ATMS terminology. Such minimal sets of assumptions responsible for a contradiction are used to decide which matches are wrong and should be withdrawn. In experiments, the LBPT reasoner with an ATMS was used to validate matches between spatial objects from OSM data (building layer) and OSGB Master Map data (Address Layer and Topology Layer) (Du et al., 2015). The study areas are in city centres of Nottingham UK and Southampton UK. The Nottingham data was obtained in 2012, and the Southampton data in 2013. The numbers of spatial objects in the case studies are shown in Table 1.

OSM spatial objects OSGB spatial objects Nottingham 281 13204 Southampton 2130 7678

Table 1: Data used for Evaluation

The initial matches are generated by the matching method implemented in Match Maps. The detailed matching method is provided by Du et al. (2016). The method consists of two main steps: matching geometries and matching spatial objects. A spatial object in a geospatial dataset has an ID, location information (coordinates and geometry) and meaningful labels, such as names or types, and represents an object in the real world. A geometry here refers to a point, a line or a polygon, which is used to represent location information in geospatial datasets. The geometry matching requires a level of tolerance, as some diﬀerence in the geometry representation of a spatial object is to be expected in diﬀerent datasets. After discussing with domain experts in geospatial science, we decided to apply the same level of tolerance for the matching method and the spatial logic used in Match Maps. In the experiments described by Du et al. (2015), the level of tolerance for the geometry matching was set to be 20 meters, based on the published estimate about the positional accuracy of OSM data. The OSM positional accuracy was estimated to be about 20 meters in UK (Haklay, 2010). In our more recent work (Du et al., 2016), we analysed how the level of tolerance aﬀects

Du & Alechina

the precision and recall of matching results for the same geographic area in Nottingham (the same data as shown in the ﬁrst row of Table 1) using 12 diﬀerent levels of tolerance within a range of 1 to 80 meters. It shows that, for the Nottingham case, 20 meters is a good estimate, though it is not the optimal value. Following the ﬁrst step of the matching method, we ﬁrst aggregate adjacent single geometries, such as those of shops within a shopping center, then establish correspondences between aggregated geometries using geometry matching. In the second step, we match spatial objects located on these corresponding aggregated geometries by comparing the similarity of names and types of spatial objects in several diﬀerent cases (one-to-one, manyto-one and many-to-many). The most diﬃcult case is when there is a match between two aggregated geometries which contain objects {a1, . . . , an} in one dataset and objects {b1, . . . , bk} in the other dataset (many-to-many matching case). When we cannot decide the exact matches automatically using names and types of objects, we generate all matches which are possibly correct between the objects in the two sets: for each pair ai, bj with similar labels, we generate same As(ai, bj), part Of (ai, bj), part Of (bj, ai). We apply reasoning in LBPT and description logic to verify consistency of these matches. The use of description logic reasoning is described by Du (2015).

Figure 5: Examples of using LNFS and LBPT for validating matches

If a minimal set of statements involved in a contradiction contains more than one retractable assumption, a domain expert is needed to decide the correctness of the retractable assumptions and remove the wrong one(s) to restore consistency. Location information is visualized and provided to help domain experts make such decisions. As shown in Figure 5, a1, b1, c1, d1 (dotted) are from OSGB data and a2, b2, c2, d2 (solid) are from OSM data. In the left example, by LNFS Axiom 6 (or by LBPT Axiom 12 and BEQ(a, b) BPT(a, b) BPT(b, a)), a minimal set of statements for deriving an inconsistency consists of BEQ(a1, a2), BEQ(b1, b2), NEAR(a1, b1), FAR(a2, b2). It is clear that BEQ(b1, b2) is wrong, as NEAR(a1, b1) and FAR(a2, b2) are facts. In the right example, BPT(d2, d1) is wrong, because it contradicts BPT(c2, c1), NEAR(c2, d2), FAR(c1, d1) by LBPT Axiom 12. As a consequence, the same As and part Of matches corresponding to BEQ(b1, b2) and BPT(d2, d1) respectively are also incorrect. Table 2 shows the numbers of nogoods generated by the LBPT reasoner with an ATMS. As mentioned earlier, nogoods are justiﬁcations for false : the minimal sets of statements from which a contradiction is derivable. The number of interactions is the number of times users are asked to take actions or use strategies to resolve problems (a strategy is a heuristic which allows users to retract all similar statements at a time, for example,

Qualitative Spatial Logics for Buffered Geometries

nogoods retracted BEQ/BPT retracted same As/part Of interactions Nottingham 172 31 1325 3 Southampton 268 114 488 7

Table 2: LBPT Veriﬁcation of Matches

retracting part Of (o, x) for any x diﬀering from an object o). As a result of LBPT reasoning and removal of BEQ and BPT assumptions, we withdraw 1325 same As/part Of assumptions for the Nottingham case and 488 same As/part Of assumptions for the Southampton case. With the LBPT validation of matches, Match Maps achieved high precision ( 90%) and recall ( 84%) for both Nottingham and Southampton cases. As described in our previous work (Du, Alechina, Hart, & Jackson, 2015), Match Maps was used by 12 experts from the University of Nottingham and Ordnance Survey of Great Britain to match about 100 buildings and places in Southampton. A graphical user interface of Match Maps is provided allowing users to take diﬀerent types of actions to remove wrong matches. The number of actions and the decision time of users are recorded. The precision and recall of the matching results are compared to those obtained without using any user-involved veriﬁcation. Experimental results showed that by using reasoning in LBPT and description logic, the precision and recall of matches generated by Match Maps were improved on average by 9% and 8% respectively. The human eﬀort is also reduced, in the sense that the decision time required is much less than that of a fully manual matching process.

8. Discussion

The spatial logics LNF, LNFS and LBPT are generally applicable to reason with spatial objects whose locations are represented at diﬀerent levels of accuracy or granularity in diﬀerent datasets. Locations of spatial objects can be represented using vector data (coordinates) or raster data (images). Sometimes, for spatial objects in diﬀerent datasets, measuring whether their locations are buﬀered equal directly is diﬃcult or impossible, for example, when locations are represented as images without knowing their coordinates. In such cases, spatial objects may be matched by comparing shapes in images or using lexical information. No matter how the matches between spatial objects are generated, the LNF/LNFS/LBPT reasoning could be used to verify consistency of matches with regard to relative locations (NEAR/FAR facts) between spatial objects in the same dataset, which are often reliable and easy to capture. Another potential application of these logics is in matching non-georeferenced volunteered spatial information or sketch data (Egenhofer, 1997; Kopczynski, 2006; Wallgr un, Wolter, & Richter, 2010). Instead of being created by surveying or other mapping techniques, sketch data is often created by a person from memory or by schematizing authoritative geospatial data. A sketch map cannot provide precise metric information such as the exact distance or size of a spatial object, but it roughly shows several kinds of qualitative relations (e.g. nearness and directions) between spatial objects. In the work by Wallgr un et al. (2010), qualitative spatial reasoning (based on the dipole relation algebra presented in Moratz, Renz, & Wolter, 2000 for checking connectivity and the cardinal direction calculus

Du & Alechina

presented in Ligozat, 1998) is used for the task of matching a sketch map of a road network to a larger geo-referenced data set, for example, from Open Street Map. Endpoints and junctions of roads are extracted and their relative directions are represented and checked by spatial reasoning. The spatial logic LNF can be applied similarly to check the relative distances between endpoints or junctions of roads. NEAR/FAR relations between these points indicate the length of roads. For the task of matching a sketch map of polygonal objects (e.g. buildings and places), the logic LNFS/LBPT can be applied. Suppose users draw a sketch map of buildings and estimate the distances between some buildings as being NEAR or FAR regarding an agreed level of tolerance σ. The NEAR and FAR relations between buildings in a geo-referenced map can be calculated automatically. A mapping between the sketch map and a geo-referenced map can be checked by reasoning in the logic LNFS/LBPT. For example, two buildings which are speciﬁed as being FAR in the sketch map cannot be matched to two buildings which are NEAR in the geo-referenced map.

The main limitation of the new spatial logics is that they require a level of tolerance σ and when using the logics, the value of σ is the same for spatial objects of diﬀerent sizes and types (such as buildings, roads, rivers and lakes). For example, the margin of error used for cities should be larger than that for buildings. Ideally, the value of σ should vary by the size and type of the spatial object being checked. This motivates the development of new spatial logics to reason about the sizes and types of spatial objects, in addition to their relative locations.

In this paper, the theorems were all proved with respect to a metric space. However, models based on a metric space may not be realisable in a 2D Euclidean space, which is more realistic for geospatial data. Suppose there are four points pi, where i {1, 2, 3, 4}. For each point pi, d(pi, pi) = 0. For any pair of them, d(pi, pj) = d(pj, pi) = 1. It is clear that there is a metric space satisfying all the distance constraints, but there is no such 2D Euclidean space. Wolter and Zakharyaschev (2003, 2005) proved that the satisﬁability problem for a ﬁnite set of MS(Q 0) formulas in a 2D Euclidean space R2 is undecidable, whilst its proper fragments may be decidable. We proved that the satisﬁability problem for a ﬁnite set of LNF formulas in a 2D Euclidean space is decidable in PSPACE (Du, 2015), but whether the satisﬁability problem for a ﬁnite set of LNFS/LBPT formulas in a 2D Euclidean space is decidable is still unknown. It also remains open that whether the LNF/LNFS/LBPT calculus is complete for models based on a 2D Euclidean space. If not, a theoretical challenge is to design logics which are complete for 2D Euclidean spaces, and hence provide more accurate debugging of matches than the logics of metric spaces.

Finally, the use of description logic and the new spatial logics may not be able to detect all the wrong matches. For example, for spatial objects X, Y in one dataset and X , Y in the other dataset, if same As(X, X ) is correct, Y is NEAR and to the south of X, and Y

is NEAR and to the north of X , then same As(Y, Y ) is wrong but cannot be detected. To deal with this, we could extend the logics with existing spatial formalisms for reasoning about directional relations (Frank, 1991, 1996; Ligozat, 1998; Balbiani et al., 1999; Goyal & Egenhofer, 2001; Skiadopoulos & Koubarakis, 2004).

Qualitative Spatial Logics for Buffered Geometries

9. Conclusion and Future Work

We presented a series of new qualitative spatial logics LNF, LNFS and LBPT for validating matches between spatial objects, especially in crowd-sourced geospatial data. For models based on a metric space, a sound and complete axiomatisation is provided and corresponding theorems are proved for each logic. The LNF, LNFS and LBPT satisﬁability problems in a metric space are all NP-complete. An LBPT reasoner with an ATMS was implemented and used as a part of Match Maps. Experimental results show that the LBPT reasoner can be used to verify consistency of matches with respect to location information and detect obvious logical errors eﬀectively. As future work, we will investigate whether the LNF/LNFS/LBPT calculus is complete for models based on a 2D Euclidean space and develop new spatial logics (e.g. for reasoning about directional relations and object sizes in addition to distances) to provide more accurate debugging of matches.

Acknowledgments

We would like to thank the anonymous reviewers who provided excellent comments that helped us improve the paper.

Appendix A. Proofs

Lemma 9 If Σ+ is an MCS, then for any pair of individual names a, b occurring in Σ, exactly one of the following cases holds in Σ+:

1. case(a, b) = BPT(a, b) BPT(b, a);

2. case(a, b) = BPT(a, b) BPT(b, a);

3. case(a, b) = BPT(a, b) BPT(b, a);

4. case(a, b) = BPT(a, b) BPT(b, a) NEAR(a, b);

5. case(a, b) = NEAR(a, b) FAR(a, b);

6. case(a, b) = FAR(a, b),

where case(a, b) denotes the formula which holds between a, b in each case.

Proof. For any pair of individual names a, b occurring in Σ+, we have:

(B B 1 N F) (B B 1 N F) (B B 1 N F) (B B 1 N F) (B B 1 N F) (B B 1 N F) (B B 1 N F) (B B 1 N F) ( B B 1 N F) ( B B 1 N F) ( B B 1 N F) ( B B 1 N F) ( B B 1 N F) ( B B 1 N F) ( B B 1 N F) ( B B 1 N F)

where B, B 1, N, F stand for BPT(a, b), BPT(b, a), NEAR(a, b), FAR(a, b) respectively. From Table 3, we have

(B B 1) (B B 1) ( B B 1) ( B B 1 N) ( N F) F.

Du & Alechina

B B 1 N F Prime Implicant Axiom/Fact used 1 1 1 1 Fact 15 1 1 1 0 B B 1 Facts 14, 15, Axiom 3 1 1 0 1 Fact 14, Axiom 3 1 1 0 0 Fact 14, Axiom 3 1 0 1 1 Fact 15 1 0 1 0 B B 1 Facts 14, 15, Axiom 3 1 0 0 1 Fact 14, Axiom 3 1 0 0 0 Fact 14, Axiom 3 0 1 1 1 Fact 15 0 1 1 0 B B 1 Facts 14, 15, Axiom 3 0 1 0 1 Fact 14, Axiom 3 0 1 0 0 Fact 14, Axiom 3 0 0 1 1 Fact 15 0 0 1 0 B B 1 N Fact 15 0 0 0 1 F Facts 14, 15, Axiom 3 0 0 0 0 N F Fact 14, Axiom 3

Table 3: truth table, 1 stands for true , 0 stands for false

Lemma 11 For any distance range g occurring in D(Σ+),

g {{0}, [0, σ], (σ, ), [0, 2σ], (2σ, ), (2σ, 4σ], (4σ, ), [0, )}.

Proof. Suppose p, q are constants and d(p, q) g is in D(Σ+). Let us look at g in all diﬀerent cases:

p = q: by Deﬁnition 13, g = {0}.

p points(a), q points(a), for some individual name a: by Deﬁnition 13, g = [0, ).

p points(a), q points(b), for diﬀerent individual names a, b: by Lemma 9 and Deﬁnition 10, exactly one of following cases holds:

C1 {χ(a, b, [0, σ]), χ(b, a, [0, σ])} B(Σ+) C2 {χ(a, b, [0, σ]), ξ(b, a, (σ, ))} B(Σ+) C3 {ξ(a, b, (σ, )), χ(b, a, [0, σ])} B(Σ+) C4 {ξ(a, b, (σ, )), ξ(b, a, (σ, )), (a, b, [0, 2σ]), (b, a, [0, 2σ])} B(Σ+) C5 { (a, b, (2σ, )), (b, a, (2σ, )), (a, b, [0, 4σ]), (b, a, [0, 4σ])} B(Σ+) C6 { (a, b, (4σ, )), (b, a, (4σ, ))} B(Σ+)

if exactly one of p, q is a witness of χ(a, b, [0, σ]) or χ(b, a, [0, σ]), then by Deﬁnition 13, in the construction process, d(p, q) [0, σ] will be added to

Qualitative Spatial Logics for Buffered Geometries

D(Σ+), and then d(p, q) [0, + ) will be added to D(Σ+). Since [0, σ] [0, + ) = [0, σ], g = [0, σ]. else if p is a witness of χ(b, a, [0, σ]) and q is a witness of χ(a, b, [0, σ]), then by Deﬁnition 13, in the construction process, d(p, q) [0, σ] will be added to D(Σ+), and then d(p, q) [0, σ] will be added to D(Σ+) again, and then d(p, q) [0, + ) will be added to D(Σ+). Since [0, σ] [0, σ] [0, + ) = [0, σ], g = [0, σ]. else, by Deﬁnition 13, g = [0, + ).

if q is a witness of χ(a, b, [0, σ]), then by Deﬁnition 13, in the construction process, d(p, q) [0, σ] will be added to D(Σ+), and then d(p, q) [0, ) will be added to D(Σ+). Since [0, σ] [0, ) = [0, σ], g = [0, σ]. else if q is a witness of ξ(b, a, (σ, )), then by Deﬁnition 13, in the construction process, d(p, q) (σ, ) will be added to D(Σ+), and then d(p, q) [0, ) will be added to D(Σ+). Since (σ, ) [0, ) = (σ, ), g = (σ, ). else, by Deﬁnition 13, g = [0, + ).

if p is a witness of ξ(a, b, (σ, )), then by Deﬁnition 13, in the construction process, d(p, q) (σ, ) will be added to D(Σ+), and then d(p, q) [0, ) will be added to D(Σ+). Since (σ, ) [0, ) = (σ, ), g = (σ, ). else if p is a witness of χ(b, a, [0, σ]), then by Deﬁnition 13, in the construction process, d(p, q) [0, σ] will be added to D(Σ+), and then d(p, q) [0, ) will be added to D(Σ+). Since [0, σ] [0, ) = [0, σ], g = [0, σ]. else, by Deﬁnition 13, g = [0, + ).

if the pair p, q is a witness of (a, b, [0, 2σ]), then by Deﬁnition 13, in the construction process, d(p, q) [0, 2σ] will be added to D(Σ+), and then d(p, q) [0, ) will be added to D(Σ+). Since [0, 2σ] [0, ) = [0, 2σ], g = [0, 2σ]. else if exactly one of p, q is a witness of ξ(a, b, (σ, )) or ξ(b, a, (σ, )), then by Deﬁnition 13, in the construction process, d(p, q) (σ, ) will be added to D(Σ+), and then d(p, q) [0, ) will be added to D(Σ+). Since (σ, ) [0, ) = (σ, ), g = (σ, ). else if p is a witness of ξ(a, b, (σ, )) and q is a witness of ξ(b, a, (σ, )), then by Deﬁnition 13, in the construction process, d(p, q) (σ, ) will be added to D(Σ+), and then d(p, q) (σ, ) will be added to D(Σ+) again, and then d(p, q) [0, ) will be added to D(Σ+). Since (σ, ) (σ, ) [0, ) = (σ, ), g = (σ, ). else, by Deﬁnition 13, g = [0, + ).

if the pair p, q is a witness of (a, b, [0, 4σ]), then by Deﬁnition 13, in the construction process, d(p, q) [0, 4σ] will be added to D(Σ+), and then,

Du & Alechina

d(p, q) (2σ, ) will be added to satisfy the formulas, and then d(p, q) [0, ) will be added to D(Σ+). Since [0, 4σ] (2σ, ) [0, ) = (2σ, 4σ], g = (2σ, 4σ]. else, by Deﬁnition 13, d(p, q) (2σ, ) will be added to satisfy the formulas, then d(p, q) [0, ) will be added to D(Σ+). Since (2σ, ) [0, ) = (2σ, ), g = (2σ, ).

In C6, by Deﬁnition 13, d(p, q) (4σ, ) will be added, then d(p, q) [0, ) will be added to D(Σ+). Since (4σ, ) [0, ) = (4σ, ), g = (4σ, ).

Therefore, g {{0}, [0, σ], (σ, ), [0, 2σ], (2σ, ), (2σ, 4σ], (4σ, ), [0, )}.

Lemma 16 (Calculation of Composition) If (m, n), (s, t), (m, ), (s, ), {l}, {r} are non-negative non-empty intervals, H1, H2, H are non-negative intervals, then the following calculation rules hold:

1. {l} {r} = [l r, l + r], if l r;

2. {l} (s, t) = (s l, t + l), if s l;

3. {l} (s, t) = [0, t + l), if l (s, t);

4. {l} (s, t) = (l t, t + l), if t l;

5. {l} (s, + ) = (s l, + ), if s l;

6. {l} (s, + ) = [0, + ), if s < l;

7. (m, n) (s, t) = (s n, t + n), if s n;

8. (m, n) (s, t) = [0, t + n), if (m, n) (s, t) = ;

9. (m, n) (s, + ) = (s n, + ), if s n;

10. (m, n) (s, + ) = [0, + ), if s < n;

11. (m, + ) (s, + ) = [0, + );

13. H1 H2 = H2 H1;

14. (H1 H2) H = (H1 H) (H2 H);

15. (S k Hk) H = S k(Hk H), where k N>0;

16. (H1 H2) H = (H1 H) (H2 H), if (H1 H2) = ;

17. (H1 H2) H = H1 (H2 H).

Qualitative Spatial Logics for Buffered Geometries

Proof.[for Rule 15] Suppose d (S k Hk) H. By Lemma 15, there exist d1 S k Hk and d2 H such that d {d1} {d2}. Since d1 S k Hk, then there exists a k N>0 such that d1 Hk. Since d1 Hk for some k, d2 H, then by Deﬁnition 7, d Hk H, for some k. Therefore d S k(Hk H). Suppose d S k(Hk H). Then, there exists a k N>0 such that d Hk H. Since Hk S k Hk, by Deﬁnition 7, d (S k Hk) H.

Proof.[for Rule 16] Suppose H1 H2 = . Then, Hi = , i {1, 2}. Since H1, H2, H are non-negative intervals, by the intersection rules and Deﬁnition 7, (H1 H2) H and (H1 H) (H2 H) are non-negative intervals. Let L = (H1 H2) H, R = (H1 H) (H2 H). If H = , then by Rule 12, L = R = ; otherwise, we show L = R by cases:

H1 H2 or H2 H1: When H1 H2, then Deﬁnition 7, H1 H H2 H. L = H2 H = R. When H2 H1, similarly, L = H1 H = R.

H1 H2 and H2 H1: Without loss of generality, let us suppose glb(H1) glb(H2) lub(H1) lub(H2). Then, glb(H1 H2) = glb(H2), lub(H1 H2) = lub(H1). To prove L = R, it is suﬃcient to show the following properties hold:

1. lub(L) = lub(R);

2. glb(L) = glb(R);

3. lub(L) L iﬀlub(R) R;

4. glb(L) L iﬀglb(R) R.

By Rules 1-14 and intersection rules, lub(L) = lub(H1 H2) + lub(H) = lub(H1) + lub(H). lub(R) = min(lub(H1) + lub(H), lub(H2) + lub(H)) = lub(H1) + lub(H). Thus, lub(L) = lub(R) (Property 1 holds). If lub(H1) H1 and lub(H) H, then by Rules 1-14 and intersection rules, lub(L) L and lub(R) R; otherwise, lub(L) L and lub(R) R. Thus, Property 3 holds. We prove Property 2 and Property 4 by cases:

H H1 = and H H2 = :

lub(H) glb(H1): glb(L) = glb(H1 H2) lub(H) = glb(H2) lub(H). glb(R) = max(glb(H1) lub(H), glb(H2) lub(H)) = glb(H2) lub(H). Thus, glb(L) = glb(R) (Property 2 holds). If glb(H2) H2 and lub(H) H, then by Rules 1-14 and intersection rules, glb(L) L and glb(R) R; otherwise, glb(L) L and glb(R) R. Thus, Property 4 holds. glb(H) lub(H2): glb(L) = glb(H) lub(H1 H2) = glb(H) lub(H1). glb(R) = max(glb(H) lub(H1), glb(H) lub(H2)) = glb(H) lub(H1). Similar to the case above, it is clear that Property 2 and Property 4 hold.

H H1 = and H H2 = : then, H (H1 H2) = . glb(L) = glb(H1 H2) lub(H) = glb(H2) lub(H).

Du & Alechina

glb(R) = max(0, glb(H2) lub(H)) = glb(H2) lub(H). Similar to the cases above, it is clear that Property 2 and Property 4 hold. H H1 = and H H2 = : then, H (H1 H2) = . glb(L) = glb(H) lub(H1 H2) = glb(H) lub(H1). glb(R) = max(glb(H) lub(H1), 0) = glb(H) lub(H1). Similar to the cases above, it is clear that Property 2 and Property 4 hold. H H1 = and H H2 = : since H1, H2, H are intervals and H1 H2 = , H (H1 H2) = . glb(L) = 0. glb(R) = max(0, 0) = 0. By Rules 1-14, glb(L) L and glb(R) R. It is clear that Property 2 and Property 4 hold.

In every case, Properties 1-4 hold.

Therefore, L = R.

Proof.[for Rule 17] Let L = (H1 H2) H, R = H1 (H2 H). By Deﬁnition 7, L = (S d1 H1,d2 H2{d1} {d2}) H. By Rule 15, L = S d1 H1,d2 H2(({d1} {d2}) H). By Rule 13, ({d1} {d2}) H = H ({d1} {d2}). By Rule 15, H ({d1} {d2}) = S d H({d} ({d1} {d2})). By Rule 13, L = S d1 H1,d2 H2,d H(({d1} {d2}) {d}). Similarly, R = S d1 H1,d2 H2,d H({d1} ({d2} {d})). To prove L = R, it is suﬃcient to show

({d1} {d2}) {d} = {d1} ({d2} {d}).

Let l = ({d1} {d2}) {d}, then l = [|d1 d2|, d1 + d2] {d}; r = {d1} ({d2} {d}), then r = {d1} [|d2 d|, d2 + d]. We prove l = r by cases:

d [|d1 d2|, d1 + d2]: By Deﬁnition 7, l = [0, d1 + d2 + d]. d1 + d2 d, d2 + d d1, d1 + d d2. Thus, d1 [|d2 d|, d2 + d]. By Deﬁnition 7, r = [0, d1 + d2 + d].

d [|d1 d2|, d1 + d2]:

d > d1 + d2: By Deﬁnition 7, l = [d d1 d2, d1 + d2 + d]. d1 < d d2 = |d2 d|. By Deﬁnition 7, r = [d d2 d1, d1 + d2 + d]. d < |d1 d2|: By Deﬁnition 7, l = [|d1 d2| d, d1 + d2 + d]. d1 d2: d < d1 d2, this is, d1 > d2 + d. By Deﬁnition 7, r = [d1 d2 d, d1 + d2 + d]. d1 < d2: d < d2 d1, this is, d1 < d2 d. By Deﬁnition 7, r = [d2 d d1, d1 + d2 + d].

In each case, l = r. Therefore, L = R.

Qualitative Spatial Logics for Buffered Geometries

Appendix B. Alternative Proof of the Path-Consistency Lemma

In this appendix, we would like to provide a sketch of an alternative proof idea of the Path-Consistency Lemma, since it may appeal to some of the readers more than the proof presented in Section 5.3. The alternative proof uses Lemma 44.

Lemma 44 If a distance constraint d(p, q) h is in DS(Σ+) and h = , then there exist d(p, q) m1 and d(p, q) m2 in DS(Σ+) such that h = m1 m2, m1 and m2 are both identity or deﬁnable intervals.

Proof. By Lemma 31, d(p, q) h is left-deﬁnable and right-deﬁnable. By Deﬁnition 16, there exists a sequence of distance constraints d(pi, pi+1) hi (p1 = p, pn = q, 0 < i < n) in D(Σ+), such that for m1 = h1 ... hn 1, h m1, h and m1 have the same greatest lower bound (both value and openness). By Deﬁnition 15, d(p, q) m1 is in DS(Σ+). By Lemma 21 and Deﬁnition 14, m1 is an identity or deﬁnable interval. Similarly, by Deﬁnition 17, there exists an m2 such that h m2, h and m2 have the same least upper bound (both value and openness), d(p, q) m2 is in DS(Σ+), m2 is an identity or deﬁnable interval. By intersection rules, h = m1 m2.

Proof.[sketch of the alternative proof of the Path-Consistency Lemma] Suppose D(Σ+) is not path-consistent. Then there exist d(p, q) g1 and d(p, q) g2 in DS(Σ+), g1 = , g2 = , and g1 g2 = . By Lemma 44, there exist d(p, q) m1 and d(p, q) s1 in DS(Σ+) such that g1 = m1 s1, m1 and s1 are both identity or deﬁnable intervals. Similarly, we have g2 = m2 s2, where m2 and s2 are both identity or deﬁnable intervals. g1 g2 = holds iﬀone of the following holds: m1 m2 = , m1 s2 = , s1 m2 = , s1 s2 = . Without loss of generality, let us suppose m1 m2 = . By Lemma 35, m1 m2 = iﬀ(m1 m2) {0} = . Let m = m1 m2. Then m is an identity or deﬁnable interval. Since d(p, q) m1 and d(p, q) m2 are in DS(Σ+), m1 = , m2 = , then by Lemma 31, d(p, q) m1 and d(p, q) m2 are left-deﬁnable and right-deﬁnable. Since d(p, q) m2, we have d(q, p) m2. By Lemma 29, d(p, p) m is left-deﬁnable and right-deﬁnable. The rest of the proof is almost the same as the proof of the Path-Consistency Lemma (starting from By Lemma 23, glb(h ) {0, σ, 2σ, 3σ, 4σ} ) presented in Section 5.3. We discuss diﬀerent ways to obtain m given its greatest lower bound (the role of m is similar to h ) and check whether can be derived in every valid case.

Appendix C. Consequences of the Path-Consistency Lemma

In this appendix, we state explicitly some implications of the Path-Consistency Lemma.

Lemma 45 Let Σ be a ﬁnite consistent set of formulas. If a distance constraint d(p, q) h is in DS(Σ+), then h = .

Proof. Follows immediately from the proof of the Path-Consistency Lemma.

Lemma 46 Let Σ be a ﬁnite consistent set of formulas. If a distance constraint d(p, p) h is in DS(Σ+), then 0 h.

Du & Alechina

Proof. Suppose a distance constraint d(p, p) h is in DS(Σ+) and 0 h. By Deﬁnition 13 and Deﬁnition 15, d(p, p) {0} is in D(Σ+). By Deﬁnition 15, d(p, p) (h {0}) = . This contradicts the fact that d(p, p) is not in DS(Σ+) (by Lemma 45). Therefore, 0 h.

Aiello, M., Pratt-Hartmann, I., & van Benthem, J. (Eds.). (2007). Handbook of Spatial Logics. Springer.

Allen, J. F. (1983). Maintaining Knowledge about Temporal Intervals. Communications of the ACM, 26(11), 832 843.

Balbiani, P., Condotta, J., & del Cerro, L. F. (1999). A New Tractable Subclass of the Rectangle Algebra. In Proceedings of the 16th International Joint Conference on Artiﬁcal Intelligence, pp. 442 447.

Bennett, B. (1996). The Application of Qualitative Spatial Reasoning to GIS. In Proceedings of the 1st International Conference on Geo Computation, Vol. I, pp. 44 47.

Bennett, B., Cohn, A. G., & Isli, A. (1997). A Logical Approach to Incorporating Qualitative Spatial Reasoning into GIS (Extended Abstract). In Proceedings of the 3rd International Conference on Spatial Information Theory, Vol. 1329 of Lecture Notes in Computer Science, pp. 503 504. Springer.

Chen, J., Cohn, A. G., Liu, D., Wang, S., Ou Yang, J., & Yu, Q. (2015). A survey of qualitative spatial representations. The Knowledge Engineering Review, 30(1), 106 136.

Clementini, E., & Felice, P. D. (1996). An algebraic model for spatial objects with indeterminate boundaries. In Proceedings of the GISDATA specialist meeting on Geographic Objects with Undeterminate Boundaries, pp. 155 169.

Clementini, E., & Felice, P. D. (1997). Approximate Topological Relations. International Journal of Approximate Reasoning, 16(2), 173 204.

Clementini, E., Felice, P. D., & Hern andez, D. (1997). Qualitative Representation of Positional Information. Artiﬁcial Intelligence, 95(2), 317 356.

Cohn, A. G., & Gotts, N. M. (1996a). Representing Spatial Vagueness: A Mereological Approach. In Proceedings of the 5th International Conference on Principles of Knowledge Representation and Reasoning, pp. 230 241.

Cohn, A. G., & Gotts, N. M. (1996b). The Egg-Yolk Representation of Regions with Indeterminate Boundaries. In Proceedings of the GISDATA Specialist Meeting on Geographical Objects with Undetermined Boundaries, pp. 171 187.

Cohn, A. G., & Renz, J. (2008). Qualitative Spatial Representation and Reasoning. In Handbook of Knowledge Representation, pp. 551 596. Elsevier.

de Kleer, J. (1986). An assumption-based TMS. Artiﬁcial Intelligence, 28(2), 127 162.

Du, H. (2015). Matching Disparate Geospatial Datasets and Validating Matches using Spatial Logic. Ph.D. thesis, School of Computer Science, University of Nottingham, UK.

Qualitative Spatial Logics for Buffered Geometries

Du, H., & Alechina, N. (2014a). A Logic of Part and Whole for Buﬀered Geometries. In Proceedings of the 21st European Conference on Artiﬁcial Intelligence, pp. 997 998.

Du, H., & Alechina, N. (2014b). A Logic of Part and Whole for Buﬀered Geometries. In Proceedings of the 7th European Starting AI Researcher Symposium, pp. 91 100.

Du, H., Alechina, N., Hart, G., & Jackson, M. (2015). A Tool for Matching Crowd-sourced and Authoritative Geospatial Data. In Proceedings of the International Conference on Military Communications and Information Systems, pp. 1 8. IEEE.

Du, H., Alechina, N., Jackson, M., & Hart, G. (2016). A Method for Matching Crowd-sourced and Authoritative Geospatial Data. Transactions in GIS. http://dx.doi.org/10.1111/tgis.12210.

Du, H., Alechina, N., Stock, K., & Jackson, M. (2013). The Logic of NEAR and FAR. In Proceedings of the 11th International Conference on Spatial Information Theory, Vol. 8116 of Lecture Notes in Computer Science, pp. 475 494. Springer.

Du, H., Nguyen, H., Alechina, N., Logan, B., Jackson, M., & Goodwin, J. (2015). Using Qualitative Spatial Logic for Validating Crowd-Sourced Geospatial Data. In Proceedings of the 29th AAAI Conference on Artiﬁcial Intelligence (the 27th Conference on Innovative Applications of Artiﬁcial Intelligence), pp. 3948 3953.

Egenhofer, M. J. (1997). Query processing in spatial-query-by-sketch. Journal of Visual Languages and Computing, 8(4), 403 424.

Egenhofer, M. J., & Franzosa, R. D. (1991). Point Set Topological Spatial Relations. International Journal of Geographical Information Systems, 5(2), 161 174.

Egenhofer, M. J., & Herring, J. R. (1991). Categorizing Binary Topological Relations Between Regions, Lines, and Points in Geographic Databases. Tech. rep., University of Maine.

Fine, K. (1975). Vagueness, truth and logic. Synth ese, 30, 263 300.

Frank, A. U. (1991). Qualitative Spatial Reasoning with Cardinal Directions. In Proceedings of the 7th Austrian Conference on Artiﬁcial Intelligence, pp. 157 167.

Frank, A. U. (1996). Qualitative Spatial Reasoning: Cardinal Directions as an Example. International Journal of Geographical Information Science, 10(3), 269 290.

Goyal, R. K., & Egenhofer, M. J. (2001). Similarity of Cardinal Directions. In Jensen, C. S., Schneider, M., Seeger, B., & Tsotras, V. J. (Eds.), Advances in Spatial and Temporal Databases, Vol. 2121 of Lecture Notes in Computer Science, pp. 36 55. Springer.

Guesgen, H. W., & Albrecht, J. (2000). Imprecise reasoning in geographic information systems. Fuzzy Sets and Systems, 113(1), 121 131.

Haklay, M. (2010). How good is volunteered geographical information? A comparative study of Open Street Map and Ordnance Survey datasets. Environment and Planning B: Planning and Design, 37(4), 682 703.

ISO Technical Committee 211 (2003). ISO 19107:2003 Geographic information Spatial schema. Tech. rep., International Organization for Standardization (TC 211).

Du & Alechina

Jackson, M., Rahemtulla, H., & Morley, J. (2010). The synergistic use of authenticated and crowd-Sourced data for emergency response. In Proceedings of the 2nd International Workshop on validation of Geo Information products for crisis management, pp. 91 99.

Kopczynski, M. (2006). Eﬃcient spatial queries with sketches. In Proceedings of the ISPRS Technical Commission II Symposium, pp. 19 24.

Kutz, O. (2007). Notes on Logics of Metric Spaces. Studia Logica, 85(1), 75 104.

Kutz, O., Sturm, H., Suzuki, N., Wolter, F., & Zakharyaschev, M. (2002). Axiomatizing Distance Logics. Journal of Applied Non-Classical Logics, 12(3-4), 425 440.

Kutz, O., Wolter, F., Sturm, H., Suzuki, N., & Zakharyaschev, M. (2003). Logics of metric spaces. ACM Transactions on Computational Logic, 4(2), 260 294.

Lehmann, F., & Cohn, A. G. (1994). The EGG/YOLK Reliability Hierarchy: Semantic Data Integration Using Sorts with Prototypes. In Proceedings of the 3rd International Conference on Information and Knowledge Management, pp. 272 279.

Li, S., Liu, W., & Wang, S. (2013). Qualitative constraint satisfaction problems: An extended framework with landmarks. Artiﬁcial Intelligence, 201, 32 58.

Li, S., Long, Z., Liu, W., Duckham, M., & Both, A. (2015). On redundant topological constraints. Artiﬁcial Intelligence, 225, 51 76.

Ligozat, G. . (1998). Reasoning about Cardinal Directions. Journal of Visual Languages & Computing, 9(1), 23 44.

Lutz, C., & Milicic, M. (2007). A Tableau Algorithm for Description Logics with Concrete Domains and General TBoxes. Journal of Automated Reasoning, 38(1-3), 227 259.

Mackworth, A. K., & Freuder, E. C. (1985). The Complexity of Some Polynomial Network Consistency Algorithms for Constraint Satisfaction Problems. Artiﬁcial Intelligence, 25(1), 65 74.

Mallenby, D. (2007). Grounding a Geographic Ontology on Geographic Data. In AAAI Spring Symposium - Logical Formalizations of Commonsense Reasoning, pp. 101 106.

Mallenby, D., & Bennett, B. (2007). Applying Spatial Reasoning to Topographical Data with a Grounded Ontology. In Proceedings of the 2nd International Conference Geo Spatial Semantics, No. 4853 in Lecture Notes in Computer Science, pp. 210 227. Springer.

Moratz, R., Renz, J., & Wolter, D. (2000). Qualitative Spatial Reasoning about Line Segments. In Proceedings of the 14th European Conference on Artiﬁcial Intelligence, pp. 234 238.

Moratz, R., & Wallgr un, J. O. (2012). Spatial reasoning with augmented points: Extending cardinal directions with local distances. Journal of Spatial Information Science, 5(1), 1 30.

Open Street Map (2012). The Free Wiki World Map. http://www.openstreetmap.org.

Ordnance Survey (2012). Ordnance Survey. http://www.ordnancesurvey.co.uk.

Pawlak, Z., Polkowski, L., & Skowron, A. (2007). Rough Set Theory. In Wiley Encyclopedia of Computer Science and Engineering. John Wiley & Sons, Inc.

Qualitative Spatial Logics for Buffered Geometries

Randell, D. A., Cui, Z., & Cohn, A. G. (1992). A Spatial Logic based on Regions and Connection. In Proceedings of the 3rd International Conference on Principles of Knowledge Representation and Reasoning, pp. 165 176.

Renz, J., & Nebel, B. (2007). Qualitative Spatial Reasoning Using Constraint Calculi. In Aiello, M., Pratt-Hartmann, I., & van Benthem, J. (Eds.), Handbook of Spatial Logics, pp. 161 215. Springer.

Roy, A. J., & Stell, J. G. (2001). Spatial Relations between Indeterminate Regions. International Journal of Approximate Reasoning, 27(3), 205 234.

Schockaert, S., Cock, M. D., Cornelis, C., & Kerre, E. E. (2008a). Fuzzy region connection calculus: An interpretation based on closeness. International Journal of Approximate Reasoning, 48(1), 332 347.

Schockaert, S., Cock, M. D., Cornelis, C., & Kerre, E. E. (2008b). Fuzzy region connection calculus: Representing vague topological information. International Journal of Approximate Reasoning, 48(1), 314 331.

Schockaert, S., Cock, M. D., & Kerre, E. E. (2009). Spatial reasoning in a fuzzy region connection calculus. Artiﬁcial Intelligence, 173(2), 258 298.

Sirin, E., Parsia, B., Grau, B. C., Kalyanpur, A., & Katz, Y. (2007). Pellet: A practical OWL-DL reasoner. Journal of Web Semantics, 5(2), 51 53.

Skiadopoulos, S., & Koubarakis, M. (2004). Composing cardinal direction relations. Artiﬁcial Intelligence, 152(2), 143 171.

Smith, N. J. (2008). Vagueness and Degrees of Truth. Oxford University Press.

Stocker, M., & Sirin, E. (2009). Pellet Spatial: A Hybrid RCC-8 and RDF/OWL Reasoning and Query Engine. In Proceedings of the 5th International Workshop on OWL: Experiences and Directions.

Sturm, H., Suzuki, N., Wolter, F., & Zakharyaschev, M. (2000). Semi-qualitative Reasoning about Distances: A Preliminary Report. In Proceedings of the Logics in Artiﬁcial Intelligence, European Workshop, JELIA, pp. 37 56.

van Beek, P. (1992). Reasoning About Qualitative Temporal Information. Artiﬁcial Intelligence, 58(1-3), 297 326.

Wallgr un, J. O., Wolter, D., & Richter, K. (2010). Qualitative matching of spatial information. In Proceedings of the 18th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, pp. 300 309.

Wolter, F., & Zakharyaschev, M. (2003). Reasoning about Distances. In Proceedings of the 18th International Joint Conference on Artiﬁcial Intelligence, pp. 1275 1282.

Wolter, F., & Zakharyaschev, M. (2005). A logic for metric and topology. Journal of Symbolic Logic, 70(3), 795 828.

Zadeh, L. A. (1975). Fuzzy logic and approximate reasoning. Synthese, 30(3-4), 407 428.

Zimmermann, K. (1995). Measuring without Measures: The Delta-Calculus. In Proceedings of the 2nd International Conference on Spatial Information Theory, pp. 59 67.