# on_the_optimal_efficiency_of_costalgebraic_a__b89d1c14.pdf

The Thirty-Third AAAI Conference on Artiﬁcial Intelligence (AAAI-19)

On the Optimal Efﬁciency of Cost-Algebraic A*

Robert C. Holte Computing Science Dept. University of Alberta Edmonton, Canada T6G 2E8 (rholte@ualberta.ca)

Sandra Zilles Computer Science Dept. University of Regina Regina, Canada S4S 0A2 (zilles@cs.uregina.ca)

Edelkamp et al. (2005) proved that A*, given an admissible heuristic, is guaranteed to return an optimal solution in any cost algebra, not just in the traditional shortest path setting. In this paper, we investigate cost-algebraic A* s optimal efﬁciency: in the cost-algebraic setting, under what conditions is A* guaranteed to expand the fewest possible states? In the traditional setting, this question was examined in detail by Dechter & Pearl (1985). They identiﬁed ﬁve different situations in which A* was optimally efﬁcient. We show that three of them continue to hold in the cost-algebraic setting, but that one does not. We also show that one of them is false, it does not hold even in the traditional setting. We introduce an alternative that does hold in the cost-algebraic setting. Finally, we show that a well-known result due to Nilsson does not hold in the general cost-algebraic setting but does hold in a slightly less general setting.

1 Introduction In the traditional shortest path problem, the cost of an edge can be any non-negative real number, the cost of a path is the sum of its edge costs, and the aim is to ﬁnd a path from the start state to a goal state whose cost is smallest according to the usual way of ordering the real numbers. This particular combination of permissible values, cost-aggregating operator, and value ordering is an example of a cost algebra. There are many other ways of deﬁning the set of permissible values, cost-aggregating operator, and value ordering, and some of these alternatives are of practical importance, in network routing for example (Sobrinho 2002). An example of a shortest path problem based on a different cost algebra is the widest path problem. A robot travels to a goal location through a network of corridors, each of which has a width. The aim is to ﬁnd a path whose narrowest corridor is widest, so as to minimize the risk of the robot scraping against a wall. The values in this problem are the positive real values up to some maximum possible width, the cost-aggregating operator is min (because it is the narrowest corridor in a path that determines the path s value), and the value ordering is the opposite of the ordering used for the normal shortest path problem (because this is a maximization problem, not a minimization problem).

Copyright c 2019, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

A* (Hart, Nilsson, and Raphael 1968) is a fundamental algorithm for solving the traditional shortest path problem, so the question naturally arises, can it also be used to solve shortest path problems in the general cost-algebraic setting (e.g. the widest path problem)? Edelkamp et al. (2005) showed that the answer is yes. Just as in the traditional setting, A* with an admissible heuristic is guaranteed to return an optimal solution in the general cost-algebraic setting. In this paper our primary focus is on the optimal efﬁciency of A*: under what conditions is A* guaranteed to expand the fewest possible states? In the traditional setting, this question was examined in detail by Dechter & Pearl (1985). They identiﬁed ﬁve different situations in which A* was optimally efﬁcient ( 0optimal in their terminology), which we refer to as results R1 through R5 (see Section 5 for details). We show that R1, R2, and R5 continue to hold in the cost-algebraic setting, but that R4 does not. We also show that R3 is false, it does not hold even in the traditional setting. We introduce an alternative to R3 that does hold in the cost-algebraic setting. Finally, we consider the following result in the traditional setting (Nilsson 1980): if heuristics h1 and h2 are admissible and h1(s) > h2(s) on every non-goal state s then every state expanded by A* using h1 is also expanded by A* using h2. We show that this does not hold in the general cost-algebraic setting but does hold in a slightly less general setting. Proofs of some lemmas are omitted due to space limitations. A technical report with all the proofs is available from the authors on request.

2 Cost Algebra Deﬁnitions The deﬁnitions in this section are copied from (Edelkamp, Jabbar, and Lluch Lafuente 2005) with a few minor changes. Informally, Ωis the set of possible edge and path costs, 1 is the cost of the empty path, is the operator for computing a path s cost from the costs of its edges, and is the ordering on costs. Those four components deﬁne a cost algebra. Isotonicity is a key deﬁning property of a cost algebra, almost all our theoretical results rely on it.

Deﬁnition 1. Let Ωbe a set and : Ω Ω Ωa binary operator. A monoid is a triple Ω, , 1 such that 1 Ωand for all a, b, c Ω:

a (b c) = (a b) c

1 a = a 1 = a (1 is the identity element) Deﬁnition 2. If Ω, , 1 is a monoid and a1, . . . , an Ωn is a sequence of length n 0, then

i=1 ai = 1, if n = 0 a1 a2 an, otherwise.

Deﬁnition 3. If Ωis a set and is a total order on Ωthen a b denotes (a b) (a = b), and a b and a b are alternative ways of writing b a and b a respectively. Deﬁnition 4. A monoid Ω, , 1 with total order on Ω is isotone iff a b implies both (a c) (b c) and (c a) (c b) for all a, b, c Ω. In cost-algebraic search, where the elements of Ωare path costs, isotonicity is a very natural property. For two paths Pa and Pb costing a and b, respectively, with a b, isotonicity requires that appending a path of cost c to Pa produces a path whose total cost is less than or equal to ( ) the total cost of the path produced by appending a path of cost c to Pb. This is closely related to the order preserving property deﬁned by Dechter & Pearl. Deﬁnition 5. A cost algebra is a 4-tuple Ω, , , 1 such that Ω, , 1 is a monoid is a total order on Ω 1 a for all a Ω Ω, , 1 with is isotone.

3 Search-Related Deﬁnitions Many A* studies, including Dechter & Pearl s, allow a state space to contain an inﬁnite number of states but require edge costs to be bounded away from zero. This requirement is used to guarantee that A* terminates after a ﬁnite number of iterations but does not play a role in Dechter & Pearl s optimal efﬁciency results. We take an alternative approach, allowing edge costs to be 0 (1 in the algebra) but restricting the state space to be ﬁnite in order to guarantee A* s termination. For a ﬁnite set of states S and cost algebra Ω, , , 1 , a problem P is a triple start, goal, σ , where start, goal S and σ is a successor function on S.1 A successor function maps each s S to a set of pairs {(si, ω(s, si))} where si S and ω : S S Ωis a cost function specifying the cost, ω(s, t), of transitioning directly from state s to state t. If σ(s) contains a pair of the form (t, x), we write t σ(s), t is called a successor of s, and we say there is an edge from s to t whose cost is x. To identify the problem P with which a successor function is associated we will use P as a subscript, i.e. σP. Σ denotes the set of all possible successor functions for a given state set and cost algebra. For a given σ Σ, a ﬁnite sequence s0, . . . , sn of states in S is a path if si σ(si 1) for i [1, n]. If P = s0, . . . , sn is a path then its cost, ω(P), is

1This deﬁnition mandates that a problem have just one goal state. Dechter & Pearl allow multiple goal states, but this difference does not materially affect any of our analysis.

Qn i=1 ω(si 1, si). Path P from state s to state t is optimal if ω(P) ω(P ) for all paths P from s to t. If there exists a path from state s to state t the distance from s to t, d(s, t), is the cost of an optimal path from s to t. If there is no path from s to t then d(s, t) = , where is a special value ( / Ω) and and are extended as follows:

a for all a Ω

a = a = = for all a Ω.

When we wish to emphasize the speciﬁc σ on which distances (and related functions such as ω) depend we will add σ as a subscript, as in dσ(s, t), ωσ(P), etc. Problem start, goal, σ is solvable if there exists a path from start to goal. In this paper we are exclusively concerned with solvable problems, so we will henceforth use the term problem to mean solvable problem . A solution for problem P is a path from start to goal, and C P = d(start, goal) is the cost of an optimal solution (written as C when P is obvious from context). A search algorithm succeeds on problem P if it returns C P when given P as input, and succeeds on a set of problems P if it succeeds on every problem in P. A heuristic is a function h : S Ω { } with h(goal) = 1. We will write A h to denote A* using h as its heuristic.

Deﬁnition 6. For heuristic h, goal S, and σ Σ, the triple h, goal, σ is admissible iff h(s) dσ(s, goal) for all s S and is consistent2 iff h(s) ωσ(s, t) h(t) for all s S and t σ(s).

Deﬁnition 7. For heuristic h, PAD(h) is the set of problems start, goal, σ such that h, goal, σ is admissible, and PCONS(h) is the set of problems start, goal, σ such that h, goal, σ is consistent.

PAD(h) and PCONS(h) correspond to Dechter & Pearl s IAD and ICONS, respectively.

Deﬁnition 8. Let h be any heuristic and P a problem. Then path P = s0, . . . , sn is strictly C -bounded iff f P (si) C for all i [0, n], where f P (si) = g P (si) h(si), and

j=0 ω(sj, sj+1).

State s is surely expanded by A* (abbreviated s.e.) iff there is a path from start to s that is strictly C -bounded.

Deﬁnition 9. Given any heuristic h, problem start, goal, σ is non-pathological iff there exists an optimal solution P = s0, . . . , sn (s0 = start, sn = goal) such that f P (si) C for i [0, n 1].

2Historically, this is the deﬁnition of a heuristic being monotone , consistency required h(s) dσ(s, t) h(t) for all s, t S. Pearl (1984) was the ﬁrst to note that the two deﬁnitions are equivalent in the traditional numerical, additive setting. This equivalence also holds in the cost-algebraic setting.

4 Fundamental A* Results

In this section we prove that A*, in the cost-algebraic setting, always expands all s.e. states (Lemma 2) and, if the problem being solved is non-pathological, it expands only the s.e. states (Lemma 3). Both results use the following generalization of Lemma 1 by Hart et al. (1968). Our proof closely follows theirs.

Lemma 1. The following holds at the beginning of every iteration prior to A* s termination: if state s is not closed, then for every path P from start to s there exists a state s P that is in Open with g(s ) g P (s ).

Proof: There are two cases to consider. Case 1: start is not closed. In this case the lemma follows with start playing the role of s for all states s and paths P from start to s. Case 2: start is closed. Let s be any state that is not closed, P any path from start to s, and the set of all states si P that are closed with g(si) g P (si). is not empty because start . Let s be the state in with the largest index in P, i.e. all other states in precede s in P. Let t be the successor of s in P (t exists because s = s since the latter is not in ). We will now show that t has the properties required of s by the lemma. At the time s became closed with its present g-value, g(s ), a path to t was created with a cost of g(s ) ω(s , t). Because of isotonicity and the fact that g(s ) g P (s ), we have g(s ) ω(s , t) g P (s ) ω(s , t) = g P (t). The current g-value of t, g(t), cannot be larger than g(s ) ω(s , t) (because g-values can only decrease as A* proceeds) and therefore g(t) g P (t). Because t has been generated (by the expansion of s ) it is either open or closed. We will now show it cannot be closed, and therefore must be open. t cannot be closed with g(t) g P (t) at the present time because if it were, it would be in , contradicting the fact that s has the largest index of the states in . t could have been in Closed with g(t) g P (t) prior to s being closed with its present g-value, but A* would have removed t from Closed at that time because a lower cost path to t was generated. Therefore at the present time, t must be open with g(t) g P (t), so s = t establishes the lemma.

Lemma 2. Let h be a heuristic and P a problem. If state s is s.e. then A h will expand it.

Proof: Because s is s.e. there exists a strictly C -bounded path P from start to s, i.e. a path such that f P (si) C for all si P. By Lemma 1, if s is not closed then there exists an open state s P with g(s ) g P (s ). Because of isotonicity, this implies g(s ) h(s ) g P (s ) h(s ), i.e. that f(s ) f P (s ). We have f P (s ) C and, if goal is on Open, C f(goal), so that f P (s ) f(goal). Since A h terminates when f(goal) is the smallest ( ) f-value among the states in Open, it will not terminate until all the states on P, including s, have been expanded.

Lemma 3. Let h be a heuristic and P a non-pathological problem. If A h breaks ties in favour of goal then it will return an optimal solution for P and will only expand s.e. states.

5 Dechter & Pearl s Optimal Efﬁciency Results Dechter & Pearl (1985) give four deﬁnitions of optimality . Of the most practical interest is what they called 0-optimal: Deﬁnition 10. Let P be a set of problems and A a set of algorithms. Then A* is 0-optimal over A relative to P iff for every P P and A A the set of nodes expanded by A* in solving P, no matter what tie-breaking rule A* uses as long as ties are broken in favour of goal, is a subset of the set of nodes expanded by A in solving P. We will focus on the situations in which Dechter & Pearl proved that A* is 0-optimal and use the phrase optimally efﬁcient to mean 0-optimal. Dechter & Pearl analyzed 12 different situations: problems could be drawn from PAD(h) or PCONS(h), problems could be pathological or non-pathological, and the algorithms to which A* is compared could be one of three types: ad (admissible, i.e. guaranteed to succeed on PAD(h)), gc (deﬁned in Section 6 below), or bf (what we call BF*(F) in Section 7 below). The following are all the combinations of problemand algorithm-types they analyzed for which A* is 0-optimal (see their Figure 9): (R1) Non-pathological P PCONS(h), ad algorithms (their Theorem 8). Their proof, with only minor changes in a few details, is valid in the cost-algebraic setting. We present this in Appendix A. (R2) Non-pathological P PCONS(h), gc algorithms. Follows from R1 because every gc algorithm is ad. (R3) Non-pathological P PAD(h), gc algorithms (their Corollary 5). In Section 6 we show that this result is false, even in the traditional setting. In Section 6.1 we give an alternative result that maintains some of the spirit of R3. (R4) Non-pathological P PAD(h), bf algorithms (their Corollary 7). In Section 7 we give an example showing that this result does not hold, in general, in the cost-algebraic setting. (R5) Non-pathological P PCONS(h), bf algorithms. For Dechter & Pearl this is a special case of R4, so they did not give a separate proof. However, their proof of R4, with only minor changes in a few details, is valid in the cost-algebraic setting when restricted to PCONS(h), as we will prove in Section 7.

6 Result R3 (Dechter & Pearl s Corollary 5) Dechter & Pearl s Corollary 5 deals with an important type of situation an admissible, but not necessarily consistent, heuristic is available for the given problem but only considers a particular type of search algorithm, called a globally compatible (gc) algorithm. They deﬁne an algorithm to be gc if it returns an optimal solution to a problem whenever A* does, even if the heuristic used by A* is inadmissible. In our notation, Dechter & Pearl s Corollary 5 is this: For any heuristic h, non-pathological problem P PAD(h), and gc algorithm A, if A h breaks ties in favour of goal then any state expanded by A h in solving P must also be expanded by A in solving P.

Since A h breaking ties in favour of goal will only expand s.e. states on a non-pathological problem (Lemma 3), to show that their Corollary 5 is false we need to give a gc algorithm and a non-pathological problem with an admissible heuristic that the gc algorithm solves optimally without expanding some s.e. state. The proof that our algorithm is gc uses the following quantities.

Deﬁnition 11. For path P originating at start deﬁne Qf(P) = max{f P (s)|s P}. For any state s, deﬁne Q(s) = min{Qf(P)|P is a path from start to s} and Qopt(s) = min{Qf(P)|P is an optimal path from start to s}. Finally, deﬁne Q = Q(goal) and Qopt = Qopt(goal).

Our gc algorithm, GC1, proceeds in two phases. In Phase I, it computes all least-cost paths from start to a selected state s and updates h(s). In Phase II it runs Algorithm C by Bagchi & Mahanti (1983) which we will refer to as Alg C on the given problem but using the updated h(s) value. Phase I can be done in various ways, for example as follows. Let s be the terminal state of a walk of a predetermined length expanding the leftmost child at each level, and let Cr the cost of that path to s. Then run a depth-ﬁrst search, without using h, with Cr as the cost bound, recording all the optimal paths to s. The new h(s) value is then Qopt(s) d(start, s).3 Thus ends Phase I. The new h(s) value deﬁned in this way has the important property that it does not change Qf(P) for any optimal path P from start to s. This property is instrumental in the proof of the following lemma.

Lemma 4. Given a problem for which Q = Qopt, the modiﬁed version of the problem produced by GC1 s Phase I also has Q = Qopt.

Alg C is identical to A* except in how it chooses nodes for expansion. It records the largest f-value expanded so far in a variable called F. When it selects a node to be expanded, it ﬁrst looks to see if there are any open nodes n with f(n) F. If there are, it chooses one with the smallest g-value. If there are not, it chooses an open node with the smallest fvalue, breaking ties in favour of nodes with the smallest gvalue. In either case, ties are broken in favour of the goal. The proof that GC1 is globally compatible with A*, i.e. it returns an optimal solution whenever A* does, is based on two properties of Alg C proven by Bagchi & Mahanti:4

(1) the solution returned by A* is never cheaper than the solution returned by Alg C (their Theorem 3.7 and Remark (i) following it). In particular, for our purposes, if A* returns an optimal solution for a problem then so does Alg C;

(2) the solution returned by Alg C is optimal if and only if Q = Qopt (their Theorem 3.8).

Together these imply that if A* solves a problem optimally then Q = Qopt. The modiﬁcation of h(s) made in GC1 s Phase I does not change this fact (Lemma 4). Since Q =

3This cannot be smaller than the original h(s) value because s is a state on all optimal paths from start to s. 4If Property (2) could be proven for A* then A* could be used in Phase II instead of Alg C.

Figure 1: GC1 does not expand s.e. node B.

Qopt in the modiﬁed problem, (2) guarantees Alg C will return an optimal solution for the modiﬁed problem, which, of course, is also an optimal solution for the original problem since GC1 has only modiﬁed h(s), not any edge costs. We have thus proven that GC1 returns an optimal solution whenever A* does. Figure 1 gives an example of a problem with an admissible heuristic that GC1 solves optimally without expanding s.e. state B. The execution of GC1 on this problem if B is the state s selected in Phase I is as follows. start A B is the only optimal path from start to B, so Qopt(B) = Qf(start A B) = f(A) = 21. Phase I therefore updates h(B) to be 21 2 = 19. In Phase II, Alg C halts immediately after expanding start because neither f(A) = 21 nor f(B) = 22 is less than f(goal). GC1 has found the optimal solution (start goal) without expanding B, which is s.e. given its original h(B) value. This completes our counterexample to Dechter & Pearl s Corollary 5. The key idea in this counterexample is that a state can be s.e. because its initially given heuristic value is excessively low and a process such as GC1 s Phase I can increase its heuristic value to the point that the state is no longer s.e.

6.1 An Alternative to Dechter & Pearl s Corollary 5 Deﬁnition 12. Search algorithm A cost-dominates search algorithm B iff for any problem P the cost of the solution returned by A, CA(P), is no worse than CB(P), the cost of the solution returned by B (formally, CA(P) CB(P)). For example, by proving that the solution returned by A* is never cheaper than the solution returned by Alg C, Bagchi & Mahanti proved that Alg C cost-dominates A*. We will now show that any algorithm that cost-dominates A* must expand all s.e. states. The proof is almost identical to Dechter & Pearl s proof of their Theorem 8 (presented in this paper as Theorem 16 in Appendix A). The key idea in the proof is the construction of a new problem from a given problem P and s.e. state s. We call the new problem Ps cd. It is identical to P except it has a new edge directly from s to goal with a suitably chosen cost (cs). Formally, Ps cd is deﬁned as follows. Deﬁnition 13. Given heuristic h, problem P = start, goal, σP , and s.e. state s, deﬁne cs = 1, and deﬁne

a new problem Ps cd = start, goal, σPs that is identical to P except for σPs cd(s), which differs from σP(s) as follows:

if goal / σP(s) then σPs cd(s) = σP(s) {(goal, cs)}, if (goal, ωP(s, goal)) σP(s) then σPs cd(s) = (σP(s) \ {(goal, ωP(s, goal))}) {(goal, cs)} Lemma 5. Let h, P, s, cs, and Ps cd be as in Deﬁnition 13. Then CA (Ps cd) C P. A crucial step in Dechter & Pearl s proof of their Theorem 8, which is also needed here, is the assertion that an algorithm that did not expand s.e. state s in solving P would behave identically on problem P and the new problem derived from it (Ps cd in our case). Eckerle et al. (2017) pointed out this is only true of certain kinds of algorithms and identiﬁed one set of assumptions that make this assertion true: that the algorithm be deterministic, expansion-based, and black box , or DXBB (see Appendix B). The following result about DXBB algorithms is what is needed in the proof of Dechter & Pearl s Theorem 8 and in the proof we are about to give about algorithms that cost-dominate A*. Lemma 6. Let A be a DXBB algorithm, P1 = start, goal, σ1 and P2 = start, goal, σ2 two problems with a common start and goal, and X the set of states expanded by A in solving P1. If σ1(s) = σ2(s) for all s X, then A will return the same solution when it solves P2 as it did when it solved P1. Theorem 7. Let h be a heuristic, P = start, goal, σP a problem, and s an s.e. state. Then any DXBB algorithm A that cost-dominates A* must expand s in solving P. Proof: Suppose, for the purpose of contradiction, that A succeeds on problem P without expanding s.e. state s and let X be the set of states expanded by A in solving P. Since s / X and s is the only state on which σPs cd and σP differ, the premises of Lemma 6 are satisﬁed, so A will return the same solution for problem Ps cd as it did for P. By Lemma 5, CA (Ps cd) C P and, by deﬁnition, C P CA(P) = CA(Ps cd). Thus, CA (Ps cd) CA(Ps cd), contradicting the theorem s premise that A cost-dominates A*.

Combining Theorem 7 and Lemma 3, we get the following optimal efﬁciency result, which is very much in the spirit of Dechter & Pearl s Theorem 11: Theorem 8. For any heuristic h, non-pathological problem P, and DXBB algorithm A that cost-dominates A*, if A h breaks ties in favour of goal then any state expanded by A h in solving P must also be expanded by A in solving P. It is instructive to consider why this theorem, with GC1 serving as algorithm A, is not contradicted by the example in Figure 1. The reason is that GC1 does not cost-dominate A* so Theorem 8 does not apply to GC1. To see that GC1 does not cost-dominate A*, consider the problem Ps cd created by applying the construction in Deﬁnition 13 to the problem in Figure 1 with s = B. This Ps cd is identical to Figure 1 except it has an edge of cost 0 (the algebra s 1) from B to goal. The solution returned by GC1 on this problem would be start goal costing 10, but A* would return the solution start B goal costing 3. This shows that GC1 does not cost-dominate A*. In general, although globally compatible algorithms are

required to return optimal solutions whenever A* does, they are allowed to return more costly solutions than A* s when A* s solutions are not optimal, as in this example.5

7 Results R4 and R5 (Dechter & Pearl s Corollary 7) Dechter & Pearl s Corollary 7 also considers the situation in which the heuristics are admissible but not necessarily consistent but with algorithms restricted in a different way. This corollary focuses on best-ﬁrst search algorithms, which are algorithms instantiating their BF* template using an evaluation function for paths, which we will denote F(P). F(P) computes a value for path P based on P s states, edge costs, and the heuristic values of P s states. We will write BF*(F) to denote BF* executed with the evaluation function F. In the context of their Corollary 7 they require F(P) = ω(P) if P is a path from start to goal but otherwise impose no restrictions on F. A particular function F that satisﬁes this requirement is, of course, A* s evaluation function: for path P from start to n, F(P) = ω(P) h(n). We will use lower case f to refer to this evaluation function. A*, in this notation, is BF*(f). Dechter & Pearl s Corollary 7, in our terminology, is this:6

For any heuristic h, non-pathological problem P PAD(h), and path evaluation function F such that BF*(F) succeeds on PAD(h), if A h breaks ties in favour of goal then any state expanded by A h in solving P must also be expanded by BF*(F) in solving P. Dechter & Pearl s Corollary 7 follows immediately from their Theorem 12(a), which is a mere restatement of their Lemma 6.7 We shall focus, therefore, on their Lemma 6. The following generalize Qf(P) (Deﬁnition 11) and Definition 8, respectively, to arbitrary path evaluation functions. Deﬁnition 14. For heuristic h, problem P, path P = s0, . . . , sn , and path evaluation function F, deﬁne QF (P) = F(s0, . . . , sk), where k [0, n] is such that F(s0, . . . , sk) F(s0, . . . , si) for all i [0, n]. Deﬁnition 15. Let h be a heuristic, P a problem, and F a path evaluation function. Then path P is strictly C - bounded w.r.t. F iff QF (P) C and state s is surely expanded w.r.t. F (abbreviated F-s.e.) iff there is a path from start to s that is strictly C -bounded w.r.t. F. When F = f this deﬁnition is identical to Deﬁnition 8 so states that we have been calling s.e. up to now will be called f-s.e. in the remainder of this section. Dechter & Pearl s Lemma 6, in our notation, is as follows. For any heuristic h and path evaluation function F, if BF*(F) succeeds on PAD(h) then for every problem P PAD(h) any state that is f-s.e. is also F-s.e.

5The ﬂaw in Dechter & Pearl s proof of their Corollary 5 occurs in the proof that a gc algorithm must expand all s.e. states (their Theorem 11). That proof incorrectly asserts that A* will always ﬁnd an optimal solution on their equivalent of Ps cd. 6Here we only give the portion of their Corollary 7 related to 0optimality. The corollary also includes claims about 1-optimality. 7In their proof of their Theorem 12(a) they mistakenly cite their Theorem 11 when they meant to cite their Lemma 6.

Their proof is as follows (of course, they used the numerical operators, not the cost-algebraic ones).8 Suppose, for the purpose of contradiction, that there exists a problem P PAD(h) in which some state s is f-s.e. but not Fs.e. Being f-s.e. means there is a path P from start to s such that Qf(P) C , and not being F-s.e. means there is no path R from start to s for which QF (R) C . In particular, QF (P) C . They then construct a problem P PAD(h) that consists only of path P, goal,9 and two edges to goal, one from s costing c1, the other from start costing c2. Figure 2 depicts P . They then deﬁne c1 and c2 so that the following all hold:

(A) (ω(P) c1) c2, (B) Qf(P) (ω(P) c1), and (C) c2 QF (P).

With these facts, the logic of the proof is as follows. Because of (A), C P = ω(P) c1, i.e. the only optimal solution in P is P followed by the edge from s to goal. (B) guarantees that all the heuristic values are admissible, i.e. that P PAD(h). Finally, (C) guarantees that the optimal path will not be found by BF*(F) because when it expands start it ﬁnds a solution path costing c2, so it will stop expanding states on P when it reaches the ﬁrst one whose F-value equals QF (P). Therefore, BF*(F) returns a suboptimal solution for P , contradicting the assumption that it succeeds on PAD(h). All that remains is to ﬁnd values of c1 and c2 that satisfy (A), (B), and (C). Here Dechter & Pearl exploit the fact that their costs can be arbitrary positive numbers. They deﬁne

c1 = Qf(P) c(P)

c2 = QF (P) + Qf(P)

It is easy to verify that (A), (B), and (C) hold with these definitions. However, our cost algebra does not have operations equivalent to subtraction or division by 2, so these deﬁnitions cannot be used, something different is needed if this proof is to be used within the cost-algebraic setting. Requirement (C) can always be satisﬁed by setting c2 = QF (P), which also gives the most latitude in satisfying (A).

8Their proof is not speciﬁc to s.e. states. What is actually being proven is that if BF*(F) succeeds on PAD(h) then for every path P, QF (P) Qf(P). 9Strictly speaking, Dechter & Pearl add two new goal states, one for each edge. The difference is immaterial.

Figure 2: The construction in Dechter & Pearl s Lemma 6.

But requirements (A) and (B) are problematic in the costalgebraic setting; a value of c1 satisfying both requirements is not guaranteed to exist, as we will show in a moment. If c2 is assigned the value QF (P) then setting c1 = h(s) always satisﬁes (A) since ω(P) h(s) = f(s) Qf(P) QF (P) = c2. If, in addition, f(s) = Qf(P), then (B) will be satisﬁed and the proof of the lemma completed. We therefore get Dechter & Pearl s Lemma 6, restricted to problems in PCONS(h), in the cost-algebraic setting by making two ﬁnal, simple observations: (1) if the given problem P is in PCONS(h) (as opposed to PAD(h)), then P constructed as described (with c1 = h(s) and c2 = QF (P)) is also in PCONS(h), and (2) in the cost-algebraic setting with a consistent heuristic f-values along a path can never decrease (just as in the traditional numerical setting).

Lemma 9. For any heuristic h and path evaluation function F, if BF*(F) succeeds on PCONS(h) then for every problem P PCONS(h) any state that is f-s.e. is also F-s.e.

Combining Lemma 9 and Lemma 3, we get R5:

Theorem 10. For any heuristic h, non-pathological problem P PCONS(h), and path evaluation function F such that BF*(F) succeeds on PCONS(h), if A h breaks ties in favour of goal then any state expanded by A h in solving P must also be expanded by BF*(F) in solving P.

Counterexample to the cost-algebraic version of Dechter & Pearl s Lemma 6. We will now give a counterexample showing that Dechter & Pearl s Lemma 6 is not true for PAD(h) in the costalgebraic setting. The algebra we will use in this example is deﬁned as follows. The values of the algebra are all integers of the form 5a + 7b where a and b are any nonnegative integers. For example, 22 is a value in this algebra (22 = 5 3 + 7 1) but 23 is not. The algebra s ordering is the usual ordering on integers. The algebra s operator is normal integer addition; 0 is the identity element ( 1 ). The set of values is closed under because (5a + 7b) + (5x + 7y) = 5(a + x) + 7(b + y). The function F(P) in this example computes the smallest cost possible for an extension of P that reaches goal taking into account the gand h-values of all the states in P. If the algebra contained all the non-negative integers this would simply be the largest f-value on P but because some integers are missing in this algebra, F(P) can be strictly larger than the largest f-value on P and still be admissible. This is precisely what Dechter & Pearl s Lemma 6 says cannot happen (see footnote 8). For example, consider the path P = start A B in Figure 3. Qf(P) = f(A) = 21 but F(P) = 22 is admissible because the cost of P is 12 and it is impossible to extend a path of cost 12 with values in the algebra to create a path costing exactly 21. The cheapest extension of P to goal that costs 21 or more has a cost of 22 so F(P) = 22 is admissible. In a problem where the only path from start to goal was path P followed by an edge from B to goal costing 10, C would be 22, and B would be f-s.e. but not F-s.e. Yet BF*(F), for the F deﬁned here, would succeed on PAD(h).

Figure 3: Counterexample to Dechter & Pearl s Lemma 6 in the cost-algebraic setting. F(start) = 0, F(start A) = 21, and F(start A B) = 22.

8 Nilsson s RESULT 6 RESULT 6 (p. 81) in (Nilsson 1980)10 is this: if heuristics h1 and h2 are admissible and h1(s) > h2(s) on every non-goal state s then every state expanded by A h1 is also expanded by A h2. Unlike the Dechter & Pearl results we have examined in this paper, RESULT 6 is about the relative pruning power of heuristics and not about whether A* is optimally efﬁcient with respect to other search algorithms. Nevertheless it is an intuitively appealing foundational result which, as we shall now show, does not hold in the cost-algebraic setting. Given two heuristics, h1 and h2, we deﬁne f 1 P (s) = g P (s) h1(s) and f 2 P (s) = g P (s) h2(s) for any state s and any path P from start to s. By f 1(s) and f 2(s), we refer to s s f-values on Open at a speciﬁc point in time.

Lemma 11. Let h1 and h2 be heuristics, P a problem, and s a non-goal state. If h1(s) h2(s), then f 1 P (s) f 2 P (s) for every path P from start to s.

It is crucial to note that isotonicity does not guarantee that f 1 P (s) f 2 P (s) when h1(s) h2(s), f 1 P (s) = f 2 P (s) when h1(s) h2(s) is permitted unless the isotonicity is strict.

Deﬁnition 16. A monoid Ω, , 1 with total order on Ωis strictly isotone iff Ω, , 1 with is isotone and, in addition, a b implies both (a c) (b c) and (c a) (c b) for all a, b Ωand all c Ω\ Zero, where Zero = {0} if there exists an element 0 Ωsuch that a 0 = 0 a = 0 for all a Ω(such an element is said to be absorptive ), and Zero = otherwise.

Edelkamp et al. (2005) did not insist on strict isotonicity because their theory did not require it and some of their applications, such as widest path, were not strictly isotone. The cost-algebraic equivalent of Nilsson s RESULT 6 is easily proven when strict isotonicity holds.

Lemma 12. Let Ω, , , 1 be a cost algebra, h1 and h2 heuristics, and P PAD(h1) PAD(h2) a problem. If Ω, , 1 with is strictly isotone and h1(s) h2(s) for every non-goal state s, then every state expanded by A h1 will also be expanded by A h2.

When the isotonicity is not strict, RESULT 6 does not hold, as seen in Figure 4. Here the optimal path, costing C , is start B goal. Using the stronger heuristic, h1, f 1(A) = f 1(B) so A h1 must choose between them. Without loss of generality, let A be the one it chooses. A is a deadend, so A h1 will expand both A and B in solving this problem. With

10Theorem 3-2 (p. 64) in (Nilsson 1971) is essentially identical.

the weaker heuristic, h2, A h2 will only expand B because f 2(B) f 2(A) and, once B is expanded, it will have a solution whose cost is C = f 2(A). This example cannot happen with strict isotonicity because it would ensure that f 1(A) f 2(A), which would mean either f 1(A) C , in which case A h1 would not expand A, or f 2(A) C , in which case A h2 would expand both A and B.

9 Conclusion One of the appealing properties of A* is its optimal efﬁciency in terms of the number of states expanded under certain circumstances, one is guaranteed not to fare better when using any alternative to A*. While Dechter & Pearl (1985) provided such results for the natural but rather restrictive setting of traditional shortest path problems, we established formal guarantees of A* s optimal efﬁciency for a much wider class of search problems. Our results are of importance in several ways. Firstly, the general cost-algebraic setting studied above can model scenarios that may well arise in practice but that are not covered by the traditional framework, such as, e.g., the widest path problem. Secondly, we pointed out a mistake in a result in the classical literature, and provided an alternative result of a similar ﬂavour. Thirdly, our formal results provide a much better understanding of known results on A* s optimal efﬁciency in the traditional shortest path setting, since they reveal which speciﬁcs of the traditional setting are essential for maintaining A* s optimal efﬁciency. For example, R4 turns out to be incorrect when dropping the speciﬁc requirement that all non-negative integers be allowed as values, and Nilsson s RESULT 6 relies on strict isotonicity. Knowing which requirements are of importance for which type of optimality result may potentially help guide the design of optimally efﬁcient algorithms for new types of search problems.

10 Acknowledgements Financial support for this research was in part provided by Canada s Natural Sciences and Engineering Research Council (NSERC) and by a grant to the ﬁrst author from the Fac-

Figure 4: State A is expanded using the stronger heuristic (h1) but not using the weaker heuristic (h2).

ulty of Science and the Computing Science Department of the University of Alberta.

Appendix A Results R1 and R2 Result R2 is a special case of R1, so we only need to show that R1 holds in the cost-algebraic setting. Our presentation of R1 s proof is different than Dechter & Pearl s, but the proof itself is almost identical. The basis for the proof of R1 is Dechter & Pearl s Theorem 8 (our Theorem 16). The key idea in proving this theorem is the construction of a new problem from a given problem P and s.e. state s. We call the new problem Ps. It is identical to P except it has a new edge directly from s to goal with a suitably chosen cost (cs). Formally, Ps is deﬁned as follows. Deﬁnition 17. Given heuristic h, problem P = start, goal, σP PCONS(h), and s.e. state s, deﬁne cs = h(s), and deﬁne a new problem Ps = start, goal, σPs that is identical to P except for σPs(s), which differs from σP(s) as follows: if goal / σP(s) then σPs(s) = σP(s) {(goal, cs)}, if (goal, ωP(s, goal)) σP(s) then σPs(s) = (σP(s) \ {(goal, ωP(s, goal))}) {(goal, cs)} Lemma 13. Let h, P, s, Ps, and cs be as in Deﬁnition 17. Then Ps PCONS(h). Lemma 14. Let h, P, s, and cs be as in Deﬁnition 17. Then d(start, s) cs C P. Proof: Because s is s.e., there is a strictly C P-bounded path P from start to s. Since s is a state on this path, we have f P (s) = ω(P) h(s) C P. By the deﬁnition of d(start, s), d(start, s) ω(P). Because the algebra is isotone, this implies d(start, s) h(s) ω(P) h(s). From d(start, s) h(s) ω(P) h(s) and ω(P) h(s) C P it follows that d(start, s) cs C P. Corollary 15. Let h, P, s, Ps and cs be as in Deﬁnition 17. Then C Ps C P. Dechter & Pearl s Theorem 8 then follows. Theorem 16. For any heuristic h, problem P = start, goal, σP PCONS(h), and s.e. state s, any DXBB algorithm A that succeeds on PCONS(h) must expand s in solving P. Proof: Suppose, for the purpose of contradiction, that A succeeds on problem P without expanding s.e. state s and let X be the set of states expanded by A in solving P. Since s / X and s is the only state on which σPs and σP differ, the premises of Lemma 6 are satisﬁed, so A will return the same solution for problem Ps as it did for P. The cost of this solution is C P, but C Ps C P (Corollary 15), so A did not succeed on Ps, contradicting the theorem s premise that A succeeds on PCONS(h).

Combining Theorem 16 and Lemma 3, we get R1. Theorem 17. For any heuristic h, non-pathological problem P PCONS(h), and DXBB algorithm A that succeeds on PAD(h), if A h breaks ties in favour of goal then any state expanded by A h in solving P must also be expanded by A in solving P.

Algorithm 1: Generic DXBB Search Algorithm

Input: start, goal, σ , h Output: a least-cost path from start to goal

1 S0 (λ, {(start, 0)})

2 for t from 1 to do

3 if Stopping Condition(t, S0, . . . , St 1, goal, h) then

4 return Solution(t, S0, . . . , St 1, goal)

5 st Choose(t, S0, . . . , St 1, goal, h)

6 St (st, σ(st))

Appendix B DXBB Algorithms Algorithm 1 is the template for DXBB algorithms by Eckerle et al. (2017) adapted to the present setting. A DXBB algorithm is any algorithm that can be implemented by instantiating Stopping Condition, Solution, and Choose in this template with deterministic functions. On iteration t Algorithm 1 computes St = (st, σ(st)) recording the state st that was expanded and the results of that expansion (st must be chosen from among the states generated in previous iterations). S0 is (λ, {(start, 0)}). S0, . . . , St 1 is called the expansion sequence up to iteration t. All the functions (Stopping Condition, Solution, and Choose) are given the expansion sequence but are not given σ, so they cannot peek ahead to see what lies beyond a state that has not yet been expanded. For more details see (Eckerle et al. 2017).

References Bagchi, A., and Mahanti, A. 1983. Search algorithms under different kinds of heuristics a comparative study. J. ACM 30(1):1 21. Dechter, R., and Pearl, J. 1985. Generalized best-ﬁrst search strategies and the optimality of A*. J. ACM 32(3):505 536. Eckerle, J.; Chen, J.; Sturtevant, N. R.; Zilles, S.; and Holte, R. C. 2017. Sufﬁcient conditions for node expansion in bidirectional heuristic search. In Proc. 27th Intl. Conf. on Automated Planning and Scheduling (ICAPS), 79 87. Edelkamp, S.; Jabbar, S.; and Lluch Lafuente, A. 2005. Cost-algebraic heuristic search. In Proc. 20th National Conference on Artiﬁcial Intelligence (AAAI), 1362 1367. Hart, P. E.; Nilsson, N. J.; and Raphael, B. 1968. A formal basis for the heuristic determination of minimum cost paths. IEEE T. Syst. Sci. Cyb. 4(2):100 107. Nilsson, N. J. 1971. Problem-Solving Methods in Artiﬁcial Intelligence. Mc Graw-Hill. Nilsson, N. J. 1980. Principles of Artiﬁcial Intelligence. Tioga Press. Pearl, J. 1984. Heuristics Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley. Sobrinho, J. L. 2002. Algebra and algorithms for Qo S path computation and hop-by-hop routing in the internet. IEEEACM T. Network. 10(4):541 550.