# Enriching Non-Parametric Bidirectional Search Algorithms

The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)

Shahaf S. Shperberg (CS Department, Ben-Gurion University, Be'er-Sheva, Israel) shperbsh@post.bgu.ac.il
Ariel Felner (ISE Department, Ben-Gurion University, Be'er-Sheva, Israel) felner@bgu.ac.il
Nathan R. Sturtevant (CS Department, University of Alberta, Canada) sturtevant@cs.du.edu
Solomon E. Shimony (CS Department, Ben-Gurion University, Be'er-Sheva, Israel) shimony@cs.bgu.ac.il
Avi Hayoun (CS Department, Ben-Gurion University, Be'er-Sheva, Israel) hyounav@cs.bgu.ac.il

## Abstract

NBS is a non-parametric bidirectional search algorithm, proven to expand at most twice the number of node expansions required to verify the optimality of a solution. We introduce new variants of NBS that are aimed at finding all optimal solutions. We then introduce an algorithmic framework that includes NBS as a special case. Finally, we introduce DVCBS, a new algorithm in this framework that aims to further reduce the number of expansions. Unlike NBS, DVCBS does not have any worst-case bound guarantees, but in practice it outperforms NBS in verifying the optimality of solutions.

## 1 Introduction and Overview

Given a graph G, the shortest-path problem is to find the least-cost path from state s to state g in G. Bidirectional heuristic search algorithms (denoted henceforth by Bi-HS) interleave two separate searches, a search forward from s and a search backward from g. Recent research (Eckerle et al. 2017) defined conditions on the node expansions required by Bi-HS algorithms to guarantee solution optimality. Following work reformulated these conditions as a must-expand graph (GMX), showing that the Minimum Vertex Cover (MVC) of GMX corresponds to the minimal number of expansions required to prove optimality (Chen et al. 2017). Finally, Shaham et al. (2017; 2018) studied the GMX structure and its extension, GMXϵ, which exploits knowledge of the minimal edge cost (ϵ), to characterize properties of the MVC.

Bi-HS algorithms can be classified as parametric or as non-parametric. Two parametric algorithms were recently developed. Fractional MM (fMM(p)) (Shaham et al. 2017) generalizes the MM algorithm (Holte et al. 2017) by controlling the fraction p of the optimal path at which the forward and backward frontiers meet. There exists an optimal fraction p* for which fMM(p*) will expand exactly an MVC of GMX, but p* is not known a priori. Another parametric algorithm is GBFHS (Barley et al. 2018), which iteratively increases the depth of the search. It is parametric in a predefined split function that determines how deep to search on each side at each iteration. GBFHS with an optimal split function also converges to an MVC of GMX. However, such a split function is not known a priori. Without knowledge of the optimal parameter values, both algorithms may expand many more nodes than an MVC of GMX.

In this paper we focus on non-parametric Bi-HS algorithms. NBS (Chen et al. 2017) is a robust state-of-the-art non-parametric algorithm that computes a vertex cover (VC) of GMX whose size is at most 2|MVC|. We enrich this line of research and introduce new settings and new algorithms that aim to find a VC of GMX.
In particular, we make the following contributions:

1. We describe and motivate the problem of finding all optimal solutions, and introduce two new versions of GMX (with/without ϵ) that are suited for such settings. This results in four different problem settings, each with its own GMX.
2. We introduce a 2-level framework for non-parametric Bi-HS algorithms and reformulate NBS as a special case.
3. Utilizing our framework, we adapt NBS to the four settings, while maintaining the 2|MVC| guarantee.
4. We introduce a new algorithm, Dynamic Vertex Cover Bidirectional Search (DVCBS). It uses the same high-level framework we developed, but unlike NBS, it always tries to expand a VC of a dynamic GMX graph, which is also introduced. Here too, four versions are possible.
5. Our experimental results show that the new variants of NBS, as well as DVCBS, outperform previous variants of NBS for finding both the first and all optimal solutions, expanding significantly fewer nodes in many cases.

## 2 Definitions and Background

Let d(x, y) denote the shortest distance between x and y, and let C* = d(s, g). Let f_F, g_F and h_F indicate f-, g-, and h-costs in the forward search, and likewise f_B, g_B and h_B in the backward search. The forward heuristic h_F is admissible iff h_F(u) ≤ d(u, g) for every state u ∈ G, and is consistent iff h_F(u) ≤ d(u, u′) + h_F(u′) for all u, u′ ∈ G. The backward heuristic h_B is defined analogously. Front-to-end Bi-HS algorithms use these two heuristic functions, and in this paper we assume that both are admissible and consistent. Front-to-front Bi-HS algorithms use heuristics between pairs of states on opposite frontiers, and are outside the focus of this paper; see Holte et al. (2017) for a survey.

### 2.1 Guaranteeing Solution Optimality

Unidirectional search algorithms must expand all nodes n with f(n) < C* in order to guarantee the optimality of solutions (Dechter and Pearl 1985). Eckerle et al. (2017) generalized this to Bi-HS by examining pairs of nodes (u, v) such that u is in the forward frontier and v is in the backward frontier. They defined conditions for when such pairs should be expanded:

1. f_F(u) < C*
2. f_B(v) < C*
3. g_F(u) + g_B(v) < C*

If u and v meet the three conditions, then to guarantee solution optimality every algorithm must expand at least one of u or v, in order to ensure that there is no path from s to g passing through u and v of cost < C*.

**Definition 1.** For each pair of states (u, v), let lb(u, v) = max{f_F(u), f_B(v), g_F(u) + g_B(v)}.

In Bi-HS, a pair of states (u, v) is called a must-expand pair (MEP) if lb(u, v) < C*. The MEP definition is equivalent to the above conditions; for each MEP only one of u or v must be expanded. In the special case of unidirectional search, algorithms expand all the nodes with f_F < C*, which is equivalent to expanding the forward node of every MEP. Bi-HS algorithms may expand nodes from either side, potentially covering all the MEPs with fewer expansions.

Shaham et al. (2018) generalized the three conditions to handle the case where a lower bound ϵ on the edge costs is available. In unit edge-cost domains ϵ = 1, while in other domains one might iterate over all action costs and set ϵ to their minimum. We denote this case by ϵ-case, as opposed to the base-case, where no knowledge of ϵ is available. For the ϵ-case, Condition 3 is changed to:

3. g_F(u) + g_B(v) + ϵ < C*

Consequently, the lower bound is changed to lb(u, v) = max{f_F(u), f_B(v), g_F(u) + g_B(v) + ϵ}, and an MEP is defined according to the new lb.
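To make these pair conditions concrete, the following Python sketch (our own illustration, not code from the paper; the `Node` class and function names are hypothetical) computes lb(u, v) and tests the MEP condition for both the base-case (ϵ = 0) and the ϵ-case.

```python
from dataclasses import dataclass

@dataclass
class Node:
    g: float  # g-cost in the node's own search direction
    h: float  # heuristic estimate toward the opposite end

    @property
    def f(self) -> float:
        return self.g + self.h

def lb(u: Node, v: Node, eps: float = 0.0) -> float:
    """Lower bound on any s-g path through forward node u and backward node v.
    eps = 0 gives the base-case bound; eps > 0 gives the epsilon-case bound."""
    return max(u.f, v.f, u.g + v.g + eps)

def is_mep(u: Node, v: Node, c_star: float, eps: float = 0.0) -> bool:
    """Must-expand pair: at least one of u, v must be expanded to prove
    that no solution cheaper than c_star exists."""
    return lb(u, v, eps) < c_star

# With C* = 3, a forward node (g_F = 1, h_F = 1) and a backward node
# (g_B = 1, h_B = 1) form an MEP in the base-case (lb = 2 < 3),
# but not in the eps-case with eps = 1 (lb = 3).
u, v = Node(g=1, h=1), Node(g=1, h=1)
print(is_mep(u, v, c_star=3))           # True
print(is_mep(u, v, c_star=3, eps=1.0))  # False
```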
### 2.2 The Must-Expand Graph (GMX)

The problem of selecting the minimal set of nodes that cover all MEPs can be restated as finding an MVC of the must-expand graph (Chen et al. 2017).

**Definition 2.** The Must-Expand Graph (GMX) of a problem instance is an undirected, unweighted bipartite graph. For each state u ∈ G there is a left vertex u_F and a right vertex u_B. GMX has an edge between a left vertex u_F and a right vertex v_B if and only if (u, v) is an MEP.

It follows that Bi-HS algorithms must expand a vertex cover (VC) of the induced GMX when solving a problem instance. The MVC is thus a lower bound on the number of expansions. Another version of GMX, denoted by GMXϵ, can be constructed for the ϵ-case (Shaham et al. 2018).

[Figure 1: Case Study: Different versions of GMX. (a) Problem Instance (b) GMX (c) GMXϵ (d) GMXA (e) GMXAϵ]

Figure 1 illustrates different versions of GMX for the problem instance in Figure 1(a), in which C* = 3. Figure 1(b) shows the corresponding GMX. The left (right) vertices are ordered by increasing (decreasing) g_F-costs (g_B-costs). Additionally, vertices with identical g_F (or g_B) are merged into a single weighted vertex, denoted as a cluster. For example, the cluster with g_F = 1 includes both A and X, and its weight is 2. Similarly, an edge that connects clusters represents all possible edges between them (the product of their weights); e.g., 6 edges connect the cluster with g_F = 1 to the one with g_B = 1. Figure 1(c) shows GMXϵ (ϵ = 1). Due to the addition of ϵ, some edges that exist in GMX no longer exist in GMXϵ. For example, the left cluster (vertex) with g_F = 1 is connected to all right clusters with g_B ≤ 1 in GMX but is only connected to the right cluster with g_B = 0 in GMXϵ.

### 2.3 The Minimum Vertex-Cover of GMX

Shaham et al. (2017) introduced Calculate_MVC() (see their Section 6.5), an algorithm for finding an MVC of a GMX. This algorithm relies on the fact that all such MVCs are contiguous and restrained in both directions. That is, there exist thresholds t_F, t_B ∈ ℝ such that t_F + t_B = C* (t_F + t_B + ϵ = C* for the ϵ-case) for which a vertex u in direction D is in the MVC if and only if g_D(u) < t_D. Calculate_MVC() iterates over all relevant pairs of values for which t_F + t_B = C* and finds the pair which induces the MVC. For example, in Figure 1(b) the MVC (colored blue) is induced by ⟨t_F, t_B⟩ = ⟨2, 1⟩ and includes only four nodes (s, A, X, g). Calculate_MVC() runs in time linear in the number of clusters (O(C*)), but assumes that GMX and C* are given as input. Thus, it can only run after the fact, once C* has been found and the entire GMX has been fully built (e.g., by running A* from both sides). Such information is not available to any Bi-HS algorithm during execution. Therefore, Bi-HS algorithms cannot guarantee that the VC they find is minimal. Hence, a main challenge of Bi-HS is to approximate an MVC by using only information available during the search.

## 3 Finding All Optimal Solutions

A common practice in the heuristic search literature is to halt the search once the first optimal solution is found and verified. This problem comprises two tasks: (1) finding a solution of cost C* and (2) verifying that there are no solutions with cost < C*. Most search algorithms interleave these tasks, completing them in an arbitrary order. The GMX analysis above only handles the second task. Therefore, an MVC of GMX may not capture the extra work needed to complete the first task of finding a solution (but |MVC| is still a lower bound on the entire search).
This is similar to only counting nodes with f < C* as necessary expansions in unidirectional search, and omitting nodes with f = C* that are expanded to find the goal (Dechter and Pearl 1985).

In many cases, all optimal solutions are required. For example, if not all the problem constraints can be encoded due to privacy issues, competing objectives, partial knowledge, etc., then an external decision maker is needed to choose a solution from the set of all optimal solutions (Byers and Waterman 1984; Arthur et al. 1997; Mahadevan and Schilling 2003). In other cases a solution may become invalid and an alternative solution needs to be obtained quickly (Siegmund, Ng, and Deb 2012; Isermann 1977). We denote these problem settings by ϵ-ALL-case when knowledge of ϵ exists, and base-ALL-case otherwise.

Finding all optimal solutions consists of only a single compound task: verifying that there are no undiscovered solutions with cost ≤ C* (as this includes the task of finding solutions with cost C*). Thus, we can generalize the analysis in Section 2.1 to the case of finding all solutions in a way that allows us to bound the number of expansions required for the entire search. In addition, we show below that using this formalization also helps in finding a first solution faster.

### 3.1 GMX for Finding All Optimal Solutions

The first step in generalizing the analysis to the task of finding all solutions is to re-define MEPs to use ≤ instead of < in the three conditions. Let u and v be nodes in the forward and backward frontiers, respectively. There can be an optimal path (of cost C*) that goes from s to u to v to g if:

1. f_F(u) ≤ C*
2. f_B(v) ≤ C*
3. g_F(u) + g_B(v) ≤ C*

Likewise, g_F(u) + g_B(v) + ϵ ≤ C* is used in the ϵ-ALL-case. We define a pair of states (u, v) to be an MEP for the all-solutions cases (we call such pairs must-expand-all pairs, or MEAPs) if lb(u, v) ≤ C*, where lb(u, v) is again the maximum of the three terms.

**Theorem 1.** Let I = ⟨G(V, E), s, g⟩. A Bi-HS algorithm B will find all optimal paths in I if and only if B expands at least one state from every MEAP.¹

*Proof.* If case: Assume that B found all optimal paths but there is an MEAP (u, v) where neither u nor v was expanded by B. Consider the two paths: U from s to u with a cost of g_F(u), and V from v to g with a cost of g_B(v). Let I′ = ⟨G′(V, E′), s, g⟩, with the same heuristic h, be a problem instance identical to I except that ⟨u, v⟩ is an edge with cost ϵ. Therefore, there is a path P from s to g in G′, composed of U, the edge ⟨u, v⟩, and V. Since (u, v) is an MEAP, the cost of P is g_F(u) + d′(u, v) + g_B(v) = g_F(u) + g_B(v) + ϵ ≤ C*. However, since B expanded neither u nor v, it cannot distinguish I from I′, so B(I′) = B(I), which does not contain P, contradicting the assumption that all optimal paths from s to g were found by B.

Only-if case: Assume that B expanded at least one state from every MEAP, and that there exists an optimal solution P = ⟨s = p_0, ..., p_k = g⟩ that was not found. Since the heuristics are admissible, for all 0 ≤ i ≤ k, f_F(p_i) ≤ C* and f_B(p_i) ≤ C*. Since P was not found, there exist nodes p_i, p_j ∈ P, p_i ≠ p_j, in the forward and backward frontiers of B, respectively, when the search terminates. P is an optimal path; thus, g_F(p_i) + g_B(p_j) + d(p_i, p_j) = C*. Since ϵ is a lower bound on the distance between nodes, g_F(p_i) + g_B(p_j) + ϵ ≤ g_F(p_i) + g_B(p_j) + d(p_i, p_j) = C*. Hence (p_i, p_j) is an MEAP, contradicting the assumption that B expanded at least one state from every MEAP. ∎

Note that the proof holds in the base-ALL-case by setting ϵ = 0.

¹ We assume B is DXBB (see Eckerle et al. 2017). We also assume B maintains a frontier of all unexpanded discovered nodes, from which nodes are removed only upon expansion.
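Operationally, the only difference between an MEP and an MEAP is the strict versus non-strict comparison against C*. The minimal sketch below (our own illustration; the function names are hypothetical) makes that single-comparison difference explicit.

```python
def lb(f_fwd, f_bwd, g_fwd, g_bwd, eps=0.0):
    """Lower bound of Definition 1 (eps = 0) or its epsilon-case variant."""
    return max(f_fwd, f_bwd, g_fwd + g_bwd + eps)

def is_mep(f_fwd, f_bwd, g_fwd, g_bwd, c_star, eps=0.0):
    """Must be covered to *verify* that no solution cheaper than C* exists."""
    return lb(f_fwd, f_bwd, g_fwd, g_bwd, eps) < c_star

def is_meap(f_fwd, f_bwd, g_fwd, g_bwd, c_star, eps=0.0):
    """Must be covered to find *all* optimal solutions (Theorem 1)."""
    return lb(f_fwd, f_bwd, g_fwd, g_bwd, eps) <= c_star

# A pair whose lower bound equals C* exactly is an MEAP but not an MEP:
print(is_mep(3, 3, 1, 2, c_star=3))   # False
print(is_meap(3, 3, 1, 2, c_star=3))  # True
```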
We use the new must-expand-all conditions to define two new graphs: GMXA for the base-ALL-case, and GMXAϵ for the ϵ-ALL-case, in a manner similar to GMX and GMXϵ respectively, but with the ≤ conditions. Importantly, |MVC| of GMXA and GMXAϵ is a lower bound on the number of nodes that must be expanded to complete the joint task of finding all optimal solutions and verifying that there are no cheaper solutions. By contrast, |MVC| of GMX and GMXϵ only bounds the minimal number of expansions to complete the (second) task of verifying that no solution with cost < C* exists.

GMXA and GMXAϵ for the example in Figure 1(a) are shown in Figures 1(d) and 1(e), respectively. As can be seen, each vertex has more neighbors due to the use of ≤ instead of < in Condition 3. For example, the cluster with g_F = 1 is now also connected to the cluster with g_B = 2. Furthermore, since Conditions 1 and 2 now also use ≤, GMXA contains additional clusters (e.g., the one with g_F = 0), and existing clusters may now be composed of additional states (e.g., the y_i states with g_F = 1 are included in GMXA but not in GMX).

Since GMXA includes more edges than GMX, the contiguous partition of their MVCs may be different, as demonstrated in Figure 1. The MVC of GMX (Figure 1(b)) is composed of the vertices {s, A, X} in the forward direction and {g} in the backward direction. The MVC of GMXA (Figure 1(d)) is composed of vertex s in the forward direction and {g, D, C, B, A} in the backward direction. Note that X is part of the MVC of GMX but not a part of the MVC of GMXA. As a result, existing Bi-HS algorithms that consider GMX when aiming to find a first solution should be modified to consider GMXA when trying to find all optimal solutions. For example, the optimal fraction of fMM(p) for finding all solutions (1/4 for Figure 1(a)) is different from the optimal fraction for finding a first solution (2/3). Furthermore, in Section 4.2 we demonstrate that algorithms which consider GMXA may be even better at finding the first solution.

Algorithm 1: LBF high-level

    2  LB ← min{h_F(s), h_B(g)}
    3  while LB < C* do
    4      C* ← Expand_Level(LB, C*)
    5      Increase LB to the next value

Algorithm 2: NBS Expand_Level(LB, C*)

    1  while true do
    2      while min f in waiting_D < LB do
    3          move best node from waiting_D to ready_D
    4      if ready_D ∪ waiting_D is empty then
    5          Terminate search - no solution was found
    6      if ready_F.g + ready_B.g ≤ LB then
    7          Expand the node with min g_D-value in ready_D (updating C*)
    8      else
    9          if waiting_D.f ≤ LB then
    10             move best node from waiting_D to ready_D
    11         else
    12             return C*

## 4 A General Framework Encompassing NBS

Near-Optimal Bidirectional Search (NBS) (Chen et al. 2017) is a robust state-of-the-art non-parametric algorithm that is guaranteed to expand a VC of GMX whose size is at most 2|MVC|. In this section, we introduce a generalization of NBS: a two-level framework which we call the Lower-Bound Framework (LBF). NBS is a specific implementation of the low level of LBF. We then introduce additional algorithms in this family which differ in their decisions at the low level of LBF.

LBF has two levels. The high level (Algorithm 1) maintains and dynamically increases a global lower bound (LB) on the cost of an optimal solution. It keeps track of all states in the frontiers (OPEN lists) of the two directions of the search. For each node pair (u, v), lb(u, v) is defined according to Definition 1 above, depending of course on the exact case (base-case, ϵ-case, etc.).
The global lower bound LB is set to be the minimal lb among all pairs.² The low level of LBF then needs to select valid nodes for expansion, i.e., nodes that may be part of paths of cost LB. All the algorithms in the LBF family discussed in this paper use the same high level, but differ in the low-level selection policy.

² Other Bi-HS algorithms also maintain and increase a global lower bound on the optimal solution, e.g., C in MM and fLim in GBFHS. These bounds use less information than the LB of LBF, which directly depends on current knowledge of MEPs as defined by the GMX theory and is therefore tighter.

### 4.1 The Low-Level Expansion Policy of NBS

The low-level policy of NBS is based on an approximate VC algorithm (Papadimitriou and Steiglitz 1982) which repeatedly chooses an edge and adds both its endpoints to the VC. Therefore, NBS repeatedly finds a pair (u, v) for which lb(u, v) ≤ LB and expands both u and v.

The implementation details of NBS, as done by the original authors (outlined in Algorithm 2), are as follows. The frontier for each direction D is split into two separate queues: waiting_D (sorted by f-value), which serves as a gateway to ready_D (sorted by g-value). Nodes with a minimal f-value are moved from waiting_D to ready_D, and only nodes from ready_D are expanded. In the pseudocode, every line which includes D is repeated twice, once for each direction. First (Lines 2–3), all nodes for which f_D(u) < LB are moved to ready_D. Next (Lines 6–7), NBS selects a pair of nodes u ∈ ready_F and v ∈ ready_B for which g_F(u) + g_B(v) ≤ LB, and expands both u and v. If no such pair is found, NBS repeatedly moves a pair of nodes for which f_F(u) ≤ LB and f_B(v) ≤ LB from waiting_D into ready_D (Line 10) and continues to look for a pair for which g_F(u) + g_B(v) ≤ LB. If such a pair is still not found, the low level reports back to the high level that no valid pairs were found, causing LB to be incremented.

Chen et al. (2017) proved three properties of NBS: (1) It is guaranteed to find an optimal solution. (2) It expands at most 2|MVC| states while finding a VC of GMX. (3) No other Bi-HS algorithm can have a better worst-case performance.

### 4.2 Finding All Optimal Solutions with NBS

The original low level used for NBS by Chen et al. (2017) is based on the properties of MEPs, which use < C* in all three conditions. Therefore, NBS first considers nodes with f_F and f_B strictly less than LB (Line 2). Nodes with f_F and f_B that equal LB are only added lazily later (Lines 9–10 of Algorithm 2). We use NBS_F and NBS_Fϵ (F for first solution) to denote the original versions (Algorithm 2) for the base-case and ϵ-case, respectively.

In order to be better suited for finding all solutions, we adapt the low-level expansion policy of NBS to be based on MEAPs, which have ≤ in the three conditions. Specifically, we modify the NBS_F expansion policy to immediately consider all nodes for which f_D(u) ≤ LB by changing the < condition in Line 2 of Algorithm 2 to ≤. This change also eliminates Lines 9–11, as such nodes are handled eagerly in Line 2. We use NBS_A and NBS_Aϵ (A for all solutions) to denote these new versions, which use the modified expansion policy (with ≤ in Line 2) and aim to find a vertex cover of GMXA and GMXAϵ, respectively. Note that there are many possible ways to implement the low level of NBS in terms of how to move nodes from waiting_D to ready_D. NBS_F and NBS_A are special cases directly inspired by GMX and GMXA.
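To illustrate how small the change from NBS_F to NBS_A is at the code level, here is a sketch (our own illustration, not the authors' implementation; the heap layout and names are assumptions) of the gateway step of Algorithm 2 with the strict test used by NBS_F and the non-strict test used by NBS_A, together with the pair test of Line 6.

```python
import heapq

def admit(waiting, ready, LB, all_solutions=False):
    """Gateway step: move nodes from waiting (min-heap keyed by f) to ready
    (min-heap keyed by g). NBS_F admits only f < LB and defers f = LB to the
    lazy step (Lines 9-10); NBS_A admits f <= LB eagerly (modified Line 2)."""
    while waiting and (waiting[0][0] <= LB if all_solutions else waiting[0][0] < LB):
        f, g, node = heapq.heappop(waiting)
        heapq.heappush(ready, (g, f, node))

def can_expand_pair(ready_F, ready_B, LB):
    """Line 6 of Algorithm 2: the two minimum-g nodes (one per direction)
    are expanded together whenever their g-values sum to at most LB."""
    return bool(ready_F) and bool(ready_B) and ready_F[0][0] + ready_B[0][0] <= LB

# Example: with LB = 3, a forward node with f_F = 3 stays in waiting_F under
# the NBS_F policy but is moved to ready_F immediately under the NBS_A policy.
waiting_F, ready_F = [(3, 1, "B")], []
admit(waiting_F, ready_F, LB=3, all_solutions=True)
print(ready_F)  # [(1, 3, 'B')]
```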
### 4.3 Finding a First Solution with NBS_A

An interesting phenomenon is that although NBS_A is designed to find all solutions, it may expand fewer nodes than NBS_F, even when finding the first solution. The explanation for this is as follows. The low level of NBS_A utilizes more information about GMX when making a decision. In an iteration where LB < C*, nodes with f = LB are part of GMX, and considering them earlier helps in increasing LB faster, thus finding an MVC faster. In iterations where LB = C*, a VC of GMX has already been found, and nodes with f = LB can lead to a solution if one was not yet discovered.

[Figure 2: Comparing NBS_A and NBS_F]

An example of this phenomenon is presented in Figure 2, where C* = 4 (the edge between s and g). Both NBS_F and NBS_A begin by expanding ⟨s, g⟩ (LB = 1), followed by ⟨A, H⟩ (LB = 2), at which point LB is incremented to 3. For NBS_F, ready_B will contain only K (f_B(K) = 2, while f = LB = 3 for all other nodes). NBS_F will expand ⟨B, K⟩ before moving other nodes to ready_B (using Lines 10–11). Next, it will expand ⟨C, E⟩ and ⟨D, F⟩ and terminate after expanding 10 nodes. By contrast, in NBS_A, after setting LB = 3, ready_B will contain E, F, I, J and K (all with f ≤ LB = 3). Since nodes with lower g-values are expanded first, NBS_A will expand ⟨B, E⟩ and ⟨C, F⟩, terminating with 8 node expansions, without expanding nodes K and D (since g_F(D) + g_B(K) = 4 > 3 = LB). Our experiments below suggest that this phenomenon is rather common in practice. Note that every pair expanded by NBS_A in every iteration where LB < C* is an edge of GMX. Thus, NBS_A retains the 2|MVC| bound until finding a VC of GMX.

## 5 Bidirectional Search using Dynamic VC

We now introduce a new family of algorithms called Dynamic Vertex Cover Bidirectional Search (DVCBS). It uses the high level of LBF but conceptually differs from the NBS family in its low-level expansion policy. While NBS always expands both nodes of a chosen MEP, DVCBS works by maintaining a dynamic version of GMX (DGMX) and greedily expanding an MVC of the DGMX at each step.

DGMX is defined as follows. Its structure resembles GMX, with two main differences: (1) The full GMX is not available during the search. Instead, DGMX contains only nodes in the forward frontier (generated but not expanded) for constructing left vertices, and only nodes from the backward frontier for constructing right vertices. (2) The value of C* is not known during the search; thus edges of DGMX are defined on pairs (u, v) such that lb(u, v) ≤ LB. Since LB ≤ C*, all such pairs are in fact MEPs of GMX.

Note that DGMX shares all the interesting properties of the full GMX. Thus, vertices with the same g-value can be merged to form a weighted vertex (cluster). More importantly, Calculate_MVC() can be directly applied to DGMX in time linear in the number of its clusters. This is done in all low-level variants of DVCBS presented next.

### 5.1 Low-Level Expansion Policy in DVCBS

There are many possible low-level expansion policies based on DGMX and on its MVC. Every node expansion deletes vertices and may add new vertices to DGMX, invalidating the most recently computed MVC. However, computing the MVC every time DGMX changes incurs extra overhead (albeit linear in the number of clusters in DGMX). Thus, an efficient expansion policy should balance between expanding many nodes and maintaining the most up-to-date DGMX and MVC.
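To make the cluster-level MVC computation concrete, here is a simplified sketch (our own illustration, not the authors' Calculate_MVC()). It relies on the prefix/threshold structure described in Section 2.3, applied to weighted clusters such as those of DGMX; it assumes the f-value conditions were already applied when the cluster lists were built, returns only the cover's weight, and favors clarity over the linear-time bookkeeping of the real algorithm.

```python
def mvc_weight(forward, backward, bound, eps=0.0):
    """forward / backward: lists of (g_value, weight) clusters, one list per
    search direction. There is an edge between a forward cluster gf and a
    backward cluster gb iff gf + gb + eps < bound (the g-condition of an MEP).
    Relies on the fact that some MVC always consists of a low-g prefix of the
    forward side plus whatever backward clusters are still needed to cover the
    edges of the remaining forward clusters."""
    fwd = sorted(forward)
    bwd = sorted(backward)
    min_gb = bwd[0][0] if bwd else float("inf")
    best = float("inf")
    for k in range(len(fwd) + 1):
        # Put the k forward clusters with smallest g_F into the cover,
        # counting only those that actually touch an edge.
        cover = sum(w for g, w in fwd[:k] if g + min_gb + eps < bound)
        # Every remaining forward cluster has g_F >= threshold, so a backward
        # cluster is still needed iff it forms an edge with that threshold.
        threshold = fwd[k][0] if k < len(fwd) else float("inf")
        cover += sum(w for g, w in bwd if g + threshold + eps < bound)
        best = min(best, cover)
    return best

# A small example with bound 3: forward clusters with g_F = 0, 1, 2
# (weights 1, 2, 1) and backward clusters with g_B = 0, 1 (weights 1, 3)
# give a minimum cover of weight 4 (the g_F < 2 prefix plus the g_B = 0 cluster).
print(mvc_weight([(0, 1), (1, 2), (2, 1)], [(0, 1), (1, 3)], bound=3))  # 4
```

With prefix sums over the sorted cluster lists, the same loop can be made to run in time linear in the number of clusters, which is the complexity the paper cites for Calculate_MVC().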
We experimented with multiple expansion policy variants, and found that an efficient balance between these two extremes is to expand a single cluster (containing all nodes with the same g_F- or g_B-value) in every iteration of the high level. This results in a manageable amount of MVC computations, while working on reasonably up-to-date information. Furthermore, since all vertices in a cluster have the same g-value, LB may increase only after expanding an entire cluster but never before. We only report experimental results for this variant.

DVCBS contains several other decision points. First, there can be several possible MVCs for a given DGMX. Additionally, as mentioned above, one cluster from the MVC should be chosen and expanded. Finally, the way we order nodes within the cluster for expansion may affect the number of expansions before reaching a solution when LB = C*. We have experimented with many possible decision choices but report the results in Section 6 using the best variant, as follows. Select the cluster with the smallest number of nodes among the clusters with minimal g_F- and g_B-values, among all MVCs. Tie-breaking for specific node expansion within a cluster orders nodes according to their order of discovery.

Pseudocode of the low level of DVCBS appears in Algorithm 3. The life cycle of DVCBS includes the following steps: (1) initialize DGMX, (2) Calculate_MVC(), (3) choose the cluster of nodes to expand from the MVC, and (4) update DGMX. Steps 2–4 are repeated until either an optimal solution is found or no possible solution exists. To execute efficiently, DVCBS uses data structures denoted as Cwaiting_D and Cready_D, which are similar to the waiting_D and ready_D queues of NBS, modified to use clusters.

Algorithm 3: DVCBS Expand a Level

    1  while true do
    2      while min f in Cwaiting_D < LB do
    3          Move best cluster from Cwaiting_D to Cready_D
    4      if Cready_D ∪ Cwaiting_D is empty then
    5          Terminate search - no solution was found
    6      DGMX ← Build_DGMX(Cready_D)
    7      if DGMX is not empty then
    8          MVC ← find_MVC(DGMX)
    9          Choose and expand a cluster from the MVC of DGMX
    10     else
    11         if Cwaiting_D.f ≤ LB then
    12             Move best cluster from Cwaiting_D to Cready_D
    13         else
    14             return true

### 5.2 Variants of DVCBS

Like NBS, DVCBS also has four variants corresponding to the four versions of GMX. The variants that use GMX and GMXϵ are denoted by DVCBS_F and DVCBS_Fϵ; they lazily move nodes with f_D = LB from Cwaiting_D to Cready_D. Likewise, variants that use DGMXA (a dynamic graph based on GMXA, i.e., based on the ≤ conditions of MEAPs) can be derived by adapting the low-level expansion policy to GMXA and GMXAϵ. Specifically, as was done for NBS, we modify the DVCBS expansion policy to immediately consider all nodes for which f_D(u) ≤ LB by changing the < condition in Line 2 of Algorithm 3 to ≤. This change also eliminates Lines 11–13, as we handle such nodes immediately in Line 2. These variants are called DVCBS_A and DVCBS_Aϵ.

Here too, DVCBS_A can also be used to find a first solution, sometimes faster than DVCBS_F, as we demonstrate using Figure 2. Initially, LB = 1. Since no nodes have f_D < LB, DGMX = DGMXA = ⟨U_F = {s}, V_B = {g}, E = {(s, g)}⟩. Assume that both DVCBS_F and DVCBS_A selected s for expansion, and so {A, B, C} are added to waiting_F. Their minimal f-value is 2 (A and B), so LB = 2. There are no clusters in waiting_F with f_F < LB; thus, {A, B} are moved to ready_F and DGMX = DGMXA = ⟨U_F = {A, B}, V_B = {g}, E = {(A, g), (B, g)}⟩.
Therefore, {g} is the MVC, and both algorithms expand g and add {E, F, H, I, J} to waiting_B. Next (LB is still 2), H is added to ready_B, and since {H} is the MVC, it is expanded and K is added to waiting_B. Now, {K} is the only cluster in waiting_B with f_B ≤ LB. Since g_B(K) = 2 and gmin_F = 1 ({A, B}), LB is incremented to 3.

At this point the algorithms diverge. DVCBS_A moves C to ready_F and {E, F, I, J, K} to ready_B. Thus, DGMXA includes 3 clusters with f_D ≤ LB = 3: {A, B, C} with g_F = 1 in ready_F, and two clusters in ready_B: {E, J, F, I} with g_B = 1, and {K} with g_B = 2. Thus, DVCBS_A expands cluster {A, B, C} (it is the MVC); then D is generated and expanded, and DVCBS_A terminates after expanding a total of 7 nodes (s, g, H, A, B, C and D). By contrast, when LB = 3, DGMX contains only two clusters with f_D < LB = 3: {A, B} (with g_F = 1) in ready_F and {K} (with g_B = 2) in ready_B. Thus, DVCBS_F expands K (node C, as well as {E, J, F, I}, are then added to ready_D, with f_D = LB = 3). Then it expands cluster {A, B, C}. Next it expands D and terminates, after expanding a total of 8 nodes (s, g, H, K, A, B, C and D). Recall that NBS_F expands 10 nodes and NBS_A expands 8 on this example.

### 5.3 No Upper Bound Guarantees for DVCBS

The most important property of NBS is its 2x bound guarantee. While DVCBS outperforms NBS on average (see the experiments below), DVCBS is not bounded in its worst case. A synthetic example and its GMX demonstrate this in Figure 3. The optimal path is ⟨s, X, g⟩ of cost k + (k−1) = 2k−1. Note that there is a longer path to X via the V_i nodes, of cost 2k+1.

[Figure 3: An example for unbounded behavior of DVCBS]

In this example, the MVC of GMX includes three nodes (g, X and Y in the backward direction, all colored blue). We next show that DVCBS never expands Y, and therefore has to expand at least k + 2 nodes, all connected to Y in GMX. To expand Y, an algorithm needs to generate it by expanding g. If at any point DVCBS chooses to expand g, then DGMX will have two nodes on the backward side ({X, Y}) and a single node on the forward side (s or one of the V_i nodes). Thus, the MVC of DGMX is always in the forward direction (choosing the V_i node), and DVCBS has to expand all of s, V_1, ..., V_{k−1} before converging to the size k+1 VC of GMX. Otherwise, if g is never chosen for expansion, DVCBS always chooses to expand nodes in the forward direction, and it has to expand k + 2 nodes (s, X and all of the V_i's) in order to find a VC. In both cases, DVCBS expands more than k nodes. Since k can be arbitrarily large, DVCBS is not bounded by a constant factor of the MVC.

## 6 Experimental Evaluation

We ran experiments on four domains: (1) 50 14-Pancake Puzzle instances with the GAP heuristic (Helmert 2010). To get a range of heuristic strengths, we also used the GAP-n heuristics (for n = 1...3), where the n smallest pancakes are left out of the heuristic computation. (2) The standard 100 instances of the 15 Puzzle (Korf 1985) using the Manhattan Distance heuristic. (3) Grid-based pathfinding: 156 maps from Dragon Age Origins (DAO) (Sturtevant 2012), each with different start and goal points (a total of 3150 instances). (4) 50 instances of the 12-disk 4-peg Towers of Hanoi (TOH4) problem with (10+2), (8+4) and (6+6) additive PDBs (Felner, Korf, and Hanan 2004).

Table 1 presents results averaged over all instances for a representative set of the heuristics we used. The same trends were observed for the other heuristics.
The left side of the table is for the base-case, while the right side is for the ϵ-case. Four low-level expansion policies were executed until all optimal solutions were found: NBS_F, NBS_A, DVCBS_F and DVCBS_A. For comparison purposes, we also added A* as a baseline. We report the number of nodes expanded at three different points of the execution, each in a different column, as follows. (1) The VC column presents the number of nodes expanded until the algorithm reached a VC of the corresponding GMX. The number reported in parentheses is the ratio (i.e., the relative size) of the discovered VC compared to an oracle (Shaham et al. 2017) that built the entire GMX (by running A* in both directions) and found its exact MVC.

Table 1: Experimental results of average node expansions across domains

| Domain | Heuristic | Algorithm | VC: GMX | first | all: GMXA | VC: GMXϵ | first | all: GMXAϵ |
|---|---|---|---|---|---|---|---|---|
| 14 Pancake | GAP | A* | 32 (1.22) | 57 | 941 (1.17) | 32 (1.23) | 57 | 941 (1.24) |
| | | NBS_F | 49 (1.88) | 163 | 1,338 (1.67) | 47 (1.83) | 147 | 1,224 (1.61) |
| | | NBS_A | 44 (1.70) | 258 | 1,106 (1.38) | 41 (1.57) | 310 | 932 (1.23) |
| | | DVCBS_F | 31 (1.18) | 106 | 880 (1.10) | 30 (1.14) | 121 | 832 (1.09) |
| | | DVCBS_A | 32 (1.24) | 191 | 901 (1.12) | 31 (1.18) | 284 | 793 (1.04) |
| | GAP-1 | A* | 6,410 (1.39) | 6,412 | 81,705 (1.56) | 6,404 (1.73) | 6,416 | 81,694 (2.11) |
| | | NBS_F | 7,184 (1.55) | 7,226 | 80,192 (1.53) | 5,870 (1.59) | 5,915 | 62,374 (1.61) |
| | | NBS_A | 5,656 (1.22) | 5,705 | 61,699 (1.18) | 4,332 (1.17) | 4,527 | 45,746 (1.18) |
| | | DVCBS_F | 5,319 (1.15) | 5,341 | 61,278 (1.17) | 4,321 (1.17) | 4,344 | 45,206 (1.17) |
| | | DVCBS_A | 4,818 (1.04) | 4,886 | 52,747 (1.01) | 3,750 (1.01) | 9,955 | 38,819 (1.00) |
| | GAP-2 | A* | 322,299 (2.65) | 322,378 | 2,659,657 (3.33) | 322,099 (4.15) | 322,938 | 2,659,326 (5.61) |
| | | NBS_F | 208,648 (1.71) | 209,723 | 1,393,062 (1.74) | 137,295 (1.77) | 137,719 | 842,947 (1.78) |
| | | NBS_A | 151,616 (1.24) | 152,046 | 991,354 (1.24) | 96,774 (1.25) | 99,773 | 614,320 (1.30) |
| | | DVCBS_F | 141,111 (1.16) | 141,669 | 864,611 (1.08) | 86,292 (1.11) | 87,012 | 493,288 (1.04) |
| | | DVCBS_A | 122,054 (1.00) | 122,587 | 800,105 (1.00) | 77,595 (1.00) | 168,176 | 474,315 (1.00) |
| 15 Puzzle | MD | NBS_F | 13,542,536 (N/A) | 13,587,955 | 28,117,879 (N/A) | 12,709,517 (N/A) | 12,748,107 | 26,162,236 (N/A) |
| | | NBS_A | 12,696,359 (N/A) | 12,817,989 | 24,649,233 (N/A) | 11,739,393 (N/A) | 12,556,299 | 22,648,690 (N/A) |
| | | DVCBS_F | 11,863,100 (N/A) | 11,940,791 | 25,717,691 (N/A) | 11,589,837 (N/A) | 11,669,720 | 24,088,398 (N/A) |
| | | DVCBS_A | 11,253,941 (N/A) | 11,449,406 | 23,276,239 (N/A) | 10,659,744 (N/A) | 11,933,791 | 21,619,261 (N/A) |
| Grids DAO | Octile | A* | 5,322 (1.25) | 5,406 | 5,758 (1.20) | 5,322 (1.25) | 5,406 | 5,758 (1.20) |
| | | NBS_F | 6,569 (1.54) | 6,686 | 6,952 (1.45) | 6,561 (1.54) | 6,677 | 6,942 (1.44) |
| | | NBS_A | 6,555 (1.54) | 6,888 | 6,932 (1.44) | 6,547 (1.53) | 6,880 | 6,919 (1.44) |
| | | DVCBS_F | 5,158 (1.21) | 5,546 | 5,594 (1.16) | 5,158 (1.21) | 5,545 | 5,593 (1.16) |
| | | DVCBS_A | 5,154 (1.21) | 5,547 | 5,590 (1.16) | 5,152 (1.21) | 5,546 | 5,586 (1.16) |
| TOH4 | 10+2 | A* | 276,081 (2.25) | 276,089 | 353,130 (2.28) | 276,081 (2.25) | 276,089 | 353,130 (2.28) |
| | | NBS_F | 234,165 (1.91) | 234,165 | 291,195 (1.88) | 232,509 (1.90) | 232,509 | 288,177 (1.86) |
| | | NBS_A | 232,268 (1.89) | 232,268 | 288,583 (1.86) | 230,108 (1.88) | 230,108 | 285,073 (1.84) |
| | | DVCBS_F | 225,910 (1.84) | 225,910 | 273,210 (1.76) | 224,233 (1.83) | 224,249 | 270,715 (1.74) |
| | | DVCBS_A | 218,820 (1.78) | 218,820 | 280,800 (1.81) | 217,247 (1.77) | 219,022 | 278,286 (1.79) |
| | 6+6 | A* | 3,239,287 (4.75) | 3,268,093 | 3,674,518 (4.89) | 3,239,287 (5.19) | 3,268,093 | 3,674,518 (5.34) |
| | | NBS_F | 731,446 (1.07) | 731,522 | 796,289 (1.06) | 663,136 (1.06) | 681,995 | 732,638 (1.07) |
| | | NBS_A | 730,562 (1.07) | 730,597 | 795,564 (1.06) | 662,424 (1.06) | 681,989 | 732,303 (1.06) |
| | | DVCBS_F | 704,213 (1.03) | 707,679 | 766,722 (1.02) | 636,375 (1.02) | 664,469 | 695,950 (1.01) |
| | | DVCBS_A | 690,389 (1.01) | 691,159 | 757,484 (1.01) | 627,983 (1.01) | 660,555 | 690,348 (1.00) |
Numbers close to 1 indicate nearly optimal VCs. Due to memory limits, some MVCs could not be computed (N/A). (2) The first column shows the number of nodes expanded until the first solution was found and verified. (3) The all column gives the number of nodes expanded until all optimal solutions were found (i.e., exactly when a VC of GMXA/GMXAϵ is found). Here, the ratio relative to the optimal MVC of GMXA/GMXAϵ is reported.

Runtime results are reported in Table 2. The node expansion rates of all variants were similar, with very low variance. Therefore, we use the number of node expansions as the measure in the following analysis of the results.

Previous research (Chen et al. 2017; Sturtevant and Felner 2018) reported that NBS tends to outperform and is more robust than A* and other related Bi-HS algorithms (e.g., MM). Table 1 confirms that A* is not as robust as the LBF family. In some cases, e.g., the 15 Puzzle, A* failed to solve all instances because memory was exhausted. Except for cases where the heuristic is very good (where the MVC might be unidirectional), A*'s performance is much worse than that of the LBF family in all three measures. See Shaham et al. (2017) for a deeper study of the relation between A* and the MVC.

Since NBS has a 2x bound guarantee, any other algorithm will expand no fewer than half the nodes of NBS, leaving little leeway. Yet, our new algorithms managed to improve upon NBS, and the following trends are evident. First, within the NBS family, NBS_A and NBS_Aϵ outperform NBS_F and NBS_Fϵ, respectively, in terms of finding a VC of GMX and of GMXA. Moreover, they found the first solution faster than NBS_F/NBS_Fϵ in all cases except GAP and DAO. Second, both DVCBS variants always outperformed the NBS variants in all three measures in the base-case, with DVCBS_A almost always being best. In the ϵ-case, DVCBS_F outperformed NBS_F in all three measures, while DVCBS_A outperformed NBS_A in VC and all. We note that the VCs discovered by the DVCBS variants were often much closer to optimal than those discovered by the NBS variants (e.g., for GAP-1, 55% vs. 4% above the MVC, a factor of 14). In fact, in some cases with a weak heuristic, DVCBS_A managed to find the exact MVC(!) of GMX (a ratio of 1).

Finally, an interesting anomaly occurs with DVCBS_Aϵ. It was the fastest to reach a VC of GMXϵ but was rarely the fastest to find a first solution; in such cases DVCBS_Fϵ was best among all algorithms. For example, for GAP-2, DVCBS_Aϵ expanded 77,595 nodes to find a VC of GMXϵ, while DVCBS_Fϵ found a VC after 86,292 expansions. However, DVCBS_Aϵ expanded 90,581 more nodes (totaling 168,176) before discovering a first solution, while DVCBS_Fϵ expanded only 720 additional nodes (totaling 87,012).
We conjecture that the reason is that in the ϵ-case, the frontiers may not yet have met (i.e., no state appears in both frontiers) when a VC is found, and DVCBS_Aϵ must perform many additional node expansions before connecting the frontiers and finding a solution. Other algorithms, by contrast, seem to perform more expansions before finding a VC, but are able to connect the frontiers during this process. We intend to study this behavior further in future work.

To summarize, DVCBS_A is clearly the algorithm of choice (among all four variants) when all optimal solutions are needed. When only a first solution is needed, DVCBS_A is the best in the base-case, while DVCBS_Fϵ is the best in the ϵ-case. Both always outperform any of the NBS variants, despite not having any theoretical guarantees.

We have also compared DVCBS_Fϵ (which is our best variant for finding a first solution in the ϵ-case) to A* as well as to MMϵ (Holte et al. 2017) and BS* (Kwa 1989), which are benchmark Bi-HS algorithms. Table 3 presents the average number of node expansions for finding a first solution in the ϵ-case. As can be seen, DVCBS_Fϵ tends to outperform all others, and is certainly the most robust to weaker heuristics.

Table 2: Average node expansions per second

| Algorithm | 14 Pancake | 15 Puzzle | Grids DAO | TOH4 |
|---|---|---|---|---|
| A* | 92,697 | N/A | 1,821,205 | 380,325 |
| NBS_F | 93,176 | 250,518 | 1,567,131 | 402,616 |
| NBS_A | 98,868 | 233,166 | 1,604,500 | 408,415 |
| DVCBS_F | 98,448 | 235,621 | 1,417,141 | 418,944 |
| DVCBS_A | 86,339 | 259,756 | 1,457,497 | 460,368 |

Table 3: Average expansions for first solution (ϵ-case)

| Domain | BS* | MMϵ | DVCBS_Fϵ | A* |
|---|---|---|---|---|
| GAP-0 | 183 | 149 | 121 | 57 |
| GAP-1 | 5,262 | 5,048 | 4,344 | 6,416 |
| GAP-2 | 266,442 | 119,310 | 87,012 | 322,938 |
| 10+2 | 174,936 | 303,189 | 224,249 | 276,089 |
| 6+6 | 1,599,018 | 1,120,392 | 664,469 | 3,268,093 |
| MD | 12,001,024 | 13,162,312 | 11,669,720 | N/A |
| Octile | 6,200 | 7,396 | 5,545 | 5,406 |

## 7 Conclusions and Future Research

We have enriched the family of non-parametric Bi-HS algorithms as well as the family of GMX graphs, while also focusing on the problem of finding all optimal solutions. We have shown that our new algorithms outperform existing ones. We aim to look deeper into these directions in the future, and to study additional variants and their relative performance.

## Acknowledgements

This work was supported by Israel Science Foundation (ISF) grant #844/17 to Ariel Felner and Eyal Shimony, by BSF grant #2017692, by NSF grant #1815660, and by the Frankel Center for CS at BGU.

## References

Arthur, J. L.; Hachey, M.; Sahr, K.; Huso, M.; and Kiester, A. 1997. Finding all optimal solutions to the reserve site selection problem: formulation and computational analysis. Environmental and Ecological Statistics 4(2):153–165.

Barley, M. W.; Riddle, P. J.; Linares López, C.; Dobson, S.; and Pohl, I. 2018. GBFHS: A generalized breadth-first heuristic search algorithm. In SoCS, 28–36.

Byers, T. H., and Waterman, M. S. 1984. Determining all optimal and near-optimal solutions when solving shortest path problems by dynamic programming. Operations Research 32(6):1381–1384.

Chen, J.; Holte, R. C.; Zilles, S.; and Sturtevant, N. R. 2017. Front-to-end bidirectional heuristic search with near-optimal node expansions. In IJCAI.

Dechter, R., and Pearl, J. 1985. Generalized best-first search strategies and the optimality of A*. J. ACM 32(3):505–536.

Eckerle, J.; Chen, J.; Sturtevant, N. R.; Zilles, S.; and Holte, R. C. 2017. Sufficient conditions for node expansion in bidirectional heuristic search. In ICAPS.

Felner, A.; Korf, R. E.; and Hanan, S. 2004. Additive pattern database heuristics. J. Artif. Intell. Res. 22:279–318.

Helmert, M. 2010. Landmark heuristics for the pancake problem. In SoCS.
Holte, R. C.; Felner, A.; Sharon, G.; Sturtevant, N. R.; and Chen, J. 2017. MM: A bidirectional search algorithm that is guaranteed to meet in the middle. Artif. Intell. 252:232–266.

Isermann, H. 1977. The enumeration of the set of all efficient solutions for a linear multiple objective program. Journal of the Operational Research Society 28(3):711–725.

Korf, R. E. 1985. Depth-first iterative-deepening: An optimal admissible tree search. Artif. Intell. 27(1):97–109.

Kwa, J. B. H. 1989. BS*: An admissible bidirectional staged heuristic search algorithm. Artif. Intell. 38(1):95–109.

Mahadevan, R., and Schilling, C. 2003. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metabolic Engineering 5(4):264–276.

Papadimitriou, C. H., and Steiglitz, K. 1982. Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall, Inc.

Shaham, E.; Felner, A.; Chen, J.; and Sturtevant, N. R. 2017. The minimal set of states that must be expanded in a front-to-end bidirectional search. In SoCS, 82–90.

Shaham, E.; Felner, A.; Sturtevant, N. R.; and Rosenschein, J. S. 2018. Minimizing node expansions in bidirectional search with consistent heuristics. In SoCS, 81–98.

Siegmund, F.; Ng, A. H. C.; and Deb, K. 2012. Finding a preferred diverse set of Pareto-optimal solutions for a limited number of function calls. In IEEE World Congress on Computational Intelligence (CEC), 2417–2424.

Sturtevant, N. R., and Felner, A. 2018. A brief history and recent achievements in bidirectional search. In AAAI.

Sturtevant, N. R. 2012. Benchmarks for grid-based pathfinding. IEEE Trans. Comput. Intellig. and AI in Games 4(2):144–148.