Extracted Text

12320-56302-1-PB.pdf

Bidirectional Search That Is Guaranteed to Meet in the Middle
Robert C. Holte
Computing Science Dept.
University of Alberta
Edmonton, Canada T6G 2E8
(rholte@ualberta.ca)
Ariel Felner
ISE Department
Ben-Gurion University
Be’er-Sheva, Israel
(felner@bgu.ac.il)
Guni Sharon
ISE Department
Ben-Gurion University
Be’er-Sheva, Israel
(gunisharon@gmail.com)
Nathan R. Sturtevant
Computer Science Dept.
University of Denver
(sturtevant@cs.du.edu)
Abstract
We presentMM, the ﬁrst bidirectional heuristic search algo-
rithm whose forward and backward searches are guaranteed
to “meet in the middle”, i.e. never expand a node beyond
the solution midpoint. We also present a novel framework
for comparingMM, A*, and brute-force search, and identify
conditions favoring each algorithm. Finally, we present ex-
perimental results that support our theoretical analysis.
1 Introduction and overview
Bidirectional search algorithms interleave two separate
searches, a normal search forward from the start state, and a
search backward (i.e. using reverse operators) from the goal.
Barker and Korf (2015)’s comparison of unidirectional
heuristic search (Uni-HS, e.g. A*), bidirectional heuristic
search (Bi-HS), and bidirectional brute-force search (Bi-BS)
has two main conclusions (for caveats, see their Section 3):
BK1:Uni-HS will expand fewer nodes than Bi-HS if
more than half of the nodes expanded by Uni-HS have
g≤C
∗
/2, whereC
∗
is the optimal solution cost.
BK2:If fewer than half of the nodes expanded by Uni-HS
using heuristichhaveg≤C
∗
/2, then addinghto Bi-BS
will not decrease the number of nodes it expands.
A central assumption made by Barker and Korf is that the
forward and backward searches comprising the bidirectional
search never expand a node whoseg-value (in the given di-
rection) exceedsC
∗
/2. We say that a bidirectional search
“meets in the middle” if it has this property.
This assumption raises a difﬁculty in applying their the-
ory, because no known Bi-HS algorithm is guaranteed to
meet in the middle under all circumstances. Papers on
“front-to-front”
1
Bi-HS typically claim their searches meet
Copyrightc∈2016, Association for the Advancement of Artiﬁcial
Intelligence (www.aaai.org). All rights reserved.
1
In bidirectional heuristic search, there are two different ways
to deﬁne the heuristic function (Kaindl and Kainz 1997). A “front-
to-end” heuristic—h
F(n)for the forward search andh B(n)for
the backward search—directly estimates the distance from node
nto the target of the search (the target for the forward search is
the goal, the target for the backward search is the start state). By
contrast a “front-to-front” heuristic estimates the distance fromn
to the search target indirectly. For forward search it is deﬁned as
h
F(n) = min
m∈OpenB
{h(n, m)+g B(m)}whereOpen Bis
in the middle, but none of them has a theorem to this ef-
fect (Areﬁn and Saha 2010; de Champeaux 1983; de Cham-
peaux and Sint 1977; Davis, Pollack, and Sudkamp 1984;
Eckerle 1994; Politowski and Pohl 1984). “Front-to-end”
Bi-HS algorithms (Auer and Kaindl 2004; Ikeda et al. 1994;
Kaindl and Khorsand 1994; Kwa 1989; Pohl 1969; Sad-
hukhan 2012) are not speciﬁcally designed to meet in the
middle, and often they do not. For example, in Barker and
Korf’s Towers of Hanoi experiment BS* (Kwa 1989) often
expanded nodes at depth 13 in each direction even though
the solution lengthsC
∗
were at most 16.
To remedy this we present a new front-to-end Bi-HS al-
gorithm,MM, that is guaranteed to meet in the middle.MM
0
is the brute-force (h( s)=0∀s) version ofMM. We also
present a new framework for comparingMM
0, unidirectional
brute-force search (Uni-BS),MM, and A* that allows a pre-
cise characterization of the regions of the state space that
will be expanded by one method but not another. We use
this to identify conditions under which one method will ex-
pand fewer nodes than another, and conditions guaranteeing
BK1’s correctness. We also show that, unlike unidirectional
search, adding a non-zero heuristic (=0 for every non-goal
node) to Bi-BS can cause it to expand more nodes. For ex-
ample,MMexpands 4 times more nodes thanMM
0in one of
our Pancake Puzzle experiments. Overall, our experiments
on the Pancake Puzzle and Rubik’s Cube show that the al-
gorithm expanding the fewest nodes could be any one of
Uni-HS,MM
0orMM, depending on the heuristic used.
Although we introduce a new algorithm (MM), we do not
present an experimental comparison ofMMto existing Bi-HS
algorithms, and we do not claim thatMM
0andMMare the best
bidirectional search algorithms in terms of minimizing run
time or the number of nodes expanded. These issues are out-
side the scope of this paper. Like the Barker and Korf paper,
this paper is theoretical.MM’s signiﬁcance is that it is the
only Bi-HS algorithm to which our analysis, and Barker and
Korf’s, applies. These theories give strong justiﬁcation for
bidirectional search algorithms that meet in the middle. As
the ﬁrst of its breed,MMrepresents a new direction for de-
veloping highly competitive Bi-HS algorithms. A thorough
the backward search’s open list,h(n, m) is a function estimating
the distance between any two nodes, andg
B(m)is theg-value of
nodemin the backward search. A front-to-front heuristic for the
backward search is deﬁned analogously.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16)

Figure 1: Diagrammatic depiction of the different regions.
empirical evaluation ofMMis an important study that we will
undertake in the future.
2 Deﬁnitions and terminology
A problem instance is a pair(start, goal)of states in a state-
space in which all edge weights are non-negative. The aim of
search is to ﬁnd aleast-cost pathfromstarttogoal.d(u, v)
is the distance (cost of a least-cost path) from stateuto state
v.C
∗
=d(start, goal)is the cost of an optimal solution.
We use the usual notation—f,g,Open,etc.—and use
gminandfminfor the minimumg- andf-value onOpen.
We have separate copies of these variables for the two search
directions, with a subscript (F or B) indicating the direction:
Forward search:f
F,gF,hF,OpenF,ClosedF, etc.
Backward search:f
B,gB,hB,OpenB,ClosedB, etc.
We assume that each search direction uses an admissible
front-to-end (Kaindl and Kainz 1997) heuristic.
We say statesis “near togoal”ifd(s, goal)≤C
∗
/2,
and “far fromgoal” otherwise. Forstart, we make a 3-way
distinction:sis “near tostart”ifd(start, s)≤C
∗
/2,“far
fromstart”ifC
∗
/2<d(start, s)≤C
∗
, and “remote” if
d(start, s)>C
∗
. These categories divide the state-space
into 6 disjoint regions shown in Figure 1. We denote these
regions by two letter acronyms. The ﬁrst letter indicates the
distance fromstart(N=near, F=far, R=remote) and the sec-
ond letter indicates the distance fromgoal(N=near, F=far).
For example, FN is the set of states that are far fromstart
and near togoal. NN includes only those states at the exact
midpoint of optimal solutions. None of the search algorithms
in this paper expands a state in RF.
In Sections 4–7 we compareMM
0,MM, Uni-BS, and A*
based mainly on the nodes they expand in each region. A re-
gion’s name denotes both the set of states and the number of
states in the region. We will use the names in equations and
inequalities. An inequality involving two algorithms, e.g. A*
<MM, indicates that one algorithm (A* in this example) ex-
pands fewer nodes than the other.
3MM: A Novel Bi-HS Algorithm
MMruns an A*-like search in both directions, except thatMM
orders nodes on the Open list in a novel way. The priority of
nodenonOpen
F,prF(n), is deﬁned to be:
pr
F(n)=max(f F(n),2g F(n)).
pr
B(n) is deﬁned analogously. We useprmin Fand
prmin
Bfor the minimum priority onOpen FandOpen B,
Algorithm 1:Pseudocode forMM
1gF(start):=g B(goal):=0;Open F:={start};
Open
B:={goal};U:=∞
2while(Open F=∅)and(Open B=∅)do3 C:=min(prmin F,prminB)
4 ifU≤max(C, fmin F,fminB,gminF+gmin B+∗)
then
5 returnU
6 ifC=prmin Fthen
7 // Expand in the forward direction
8 choosen∈Open Ffor whichpr F(n)=prmin F
andg F(n)is minimum
9 movenfromOpen FtoClosedF
10
foreachchildcofndo11 ifc∈Open F∪ClosedFand
g
F(c)≤g F(n)+cost(n, c)then12 continue
13 ifc∈Open F∪ClosedFthen
14 removecfromOpen F∪ClosedF
15 gF(c):=g F(n)+cost(n, c)
16 addctoOpen F
17
ifc∈Open Bthen18 U:=min(U,g F(c)+g B(c))
19 else
20 // Expand in the backward direction, analogously
21return∞
respectively, andC=min(prmin F,prminB). On each it-
erationMMexpands a node with priorityC.Uis the cost
of the cheapest solution found so far. Initially inﬁnite,Uis
updated whenever a better solution is found.MMstops when
U≤max(C, fmin
F, fminB, gminF+gminB+∗)
where∗is the cost of the cheapest edge in the state-space.
Each of the last three terms inside the max is a lower
bound on the cost of any solution that might be found by continuing to search. Therefore, ifUis smaller than or equal
to any of them, its optimality is guaranteed andMMcan safely
stop. It is safe to stop whenU≤CbecauseC≤C
∗
until
all optimal solutions have been found (Theorem 10 in (Holte et al. 2015)). Therefore,U≤CimpliesU=C
∗
.
MMhas the following properties:
(P1)MM’s forward (backward) search never expands a state
remote or far fromstart(goal), i.e. its forward and back-
ward searches meet in the middle.
(P2)MMnever expands a node whosef-value exceedsC
∗
.
(P3)MMreturnsC
∗
.
(P4)If there exists a path fromstarttogoalandMM’s
heuristics are consistent,MMnever expands a state twice.
Holte et al. (2015) gives complete proofs thatMMhas these
properties. Here we provide a sketch of the proof of P1,
MM’s distinctive property.2g(n)is larger thanf(n)when
g(n)>h(n).Ifnis remote or far fromstart(goal) this
makespr
F(n)>C
∗
(prB(n)>C
∗
). Ifmis near tostart
(goal) on any optimal path,pr
F(m)≤C
∗
(prB(m)≤C
∗
).
Thus, an optimal solution will be found beforeMMexpands
any node that is remote or far fromstartand far fromgoal.3412

Figure 2:MM 0need not expand all nodes withg F(n)<
C
∗
/2org B(n)<C
∗
/2.
The2g(n)term also guarantees that a node that is remote
or far fromstart(goal) but near togoal(start) will be ex-
panded byMM’s backward (forward) search (if it is expanded
at all). The2g(n)term inpr(n)therefore guarantees prop-
erty P1.
Algorithm 1 gives pseudocode forMM. Lines 2–20 are the
usualbest-ﬁrst searchexpansion cycle. Duplicate detection
is in line 11.Uis updated in line 18 and checked in line 4.
Here, only the cost of the optimal path is returned (line 5). It
is straightforward to add code to get an actual solution path.
Whenprmin
F=prmin Bany rule could be used to
break the tie (e.g. Pohl (1969)’s cardinality criterion). How-
ever, to exploit thegmin
F+gmin B+≤stopping condi-
tion, it is advantageous to continue to expand nodes in the
same direction (in line 6 ties are broken in favor of forward
search) until there is no longer a tie orgminin that direction
has increased, and to break ties among nodes with the same
priority in the chosen search direction in favor of smallg.
3.1MM 0
MM0is the brute-force version ofMM, i.e.MMwhenh(n)=
0∀n. Thus forMM
0:prF(n)=2 g F(n)andpr B(n)=
2g
B(n).MM 0is identical to Nicholson’s (1966) algorithm
except that Nicholson’s stops whenU≤gmin
F+gminB,
whereasMM
0stops whenU≤gmin F+gminB+≤.
4MM 0compared to Uni-BS
We now use the region-based framework introduced in Sec-
tion 2 to analyze under what conditions one type of algo-
rithm will expand more nodes than another. The analysis
will be made on a region-by-region basis, since, as we will
see, in all cases except Uni-HS vs. Uni-BS no algorithm-
type is superior to any other in all regions. We will summa-
rize these analyses with three general rules (GR1, GR2, and
GR3). These are general expectations, not iron-clad guaran-
tees. There are many factors in play in a given situation, each
factor may favor a different algorithm-type. It is the net sum
of these factors that ultimately determines which algorithm-
type outperforms another. Our general rules state what we
expect will usually be the dominant forces.
We begin by analyzing the brute-force algorithms since
this lays the foundation for the subsequent comparisons.
Uni-BS only expands nodes that are near to or far from
start. We write this as the equation:
(Eq. 1)Uni-BS = NF + NN + F
∀
N+F
∀
F
Figure 3: State space in which NN is large.
F
∀
indicates that Uni-BS might not expand all the nodes that
are far fromstart. For example, Uni-BS will usually not
expand all nodes that are exactly distanceC
∗
fromstart.
By contrast, Uni-BS must expand all nodes near tostart.
MM
0only expands nodes that are near tostartor togoal:
(Eq. 2)MM
0=N
∀
F+N
∀
N
∀
+FN
∀
+RN
∀
.
N
∀
indicates thatMM 0might not expand all the nodes that
are near tostartorgoal. For example, in unit-cost spaces
whenC
∗
is even,MM 0will not expand any node in NN be-
cause an optimal path will be known by the time all the nodes distanceC
∗
/2−1fromstartandgoalhave been
expanded. The≤in thegmin
F+gmin B+≤termination
condition means thatMM
0can terminate before some nodes
withg
F(n)<C
∗
/2org B(n)<C
∗
/2have been ex-
panded. This is illustrated in Figure 2; the numbers in the nodes are discussed in Sections 5 and 7, they may be ig- nored for now.S
i(Gi) is the layer of nodes at depthiin
the tree rooted atstart(goal). AfterMM
0expandsstartand
goal,AandS
1will be inOpen F, andCandG 1will be
inOpen
B, all withg=1. Since ties are broken in favor
of the forward direction,MM
0will next expandAandS 1,
generatingBandS
2withg F=2. It will then switch di-
rections and expandCandG
1in some order. As soon as
Cis expanded a solution costingU=4is found. Since
gmin
F+gminB+1=2+1+1 ≥U,MM 0can stop. This
may happen before some nodes inG
1are expanded even
though they are distance1fromgoalandC
∗
/2=2.
Uni-BS expands more nodes thanMM
0iff (Eq. 1>Eq. 2)
(Eq. 3)NF+NN+F
∀
N+F
∀
F>N
∀
F+N
∀
N
∀
+FN
∀
+RN
∀
To identify thecore differencesbetween the algorithms, i.e.
regions explored by one algorithm but not the other, we ig- nore the difference between N and N
∀
and between F and F
∀
,
which simpliﬁes Eq. 3 to:
(Eq. 4)FF>RN
We have identiﬁed two conditions that guarantee FF>RN:
(1)WhenC
∗
=D, the diameter of the space, there are no
remote states, by deﬁnition, so RN is empty. (2)When the number of states far fromstartis larger than
the number of states near togoal, i.e. if FF + FN>FN+NN
+ RN, or equivalently, FF>RN + NN. We say a problem
(start, goal)isbi-friendlyif it has this property.
A special case of bi-friendly problems occurs when the

number of states distancedfromstartis the same as the
number of states distancedfromgoal, for alld≤C
∗
. This
occurs often in standard heuristic search testbeds, e.g. the
Pancake Puzzle, Rubik’s Cube, and the Sliding Tile Puzzle
when the blank is in the same location type (e.g. a corner) in
bothstartandgoal. In such cases, a problem is bi-friendly
if the number of states near tostartis less than the num-
ber of states far fromstart, i.e. more than half the states at
depthsd≤C
∗
occur after the solution midpoint. This is
similar to the condition in BK1 withh(s)=0∀s. In many
testbeds this occurs because the number of states distance
dfrom any state continues to grow asdincreases untildis
well pastD/2. For example, Rubik’s Cube hasD=20and
the number of states at distancedonly begins to decrease
whend=19(Table 5.1, (Rokicki et al. 2013)).
Non-core differences (NF, NN, FN) can sometimes cause
large performance differences. The example in Figure 3 ex-
ploits the fact that Uni-BS always expands all nodes in NN
butMM
0sometimes does not. All edges cost 1.startand
goaleach have one neighbor (s andgrespectively) that are
roots of depthdbinary trees that share leaves (the middle
layer, which is NN).C
∗
=2d+2and all paths fromstart
togoalare optimal. FF and RN are empty. The values on the
ﬁgure’s left may be ignored for now, they are used in Sec-
tion 5.MM
0expands all the nodes except those in the middle
layer, for a total of2·2
d
nodes expanded. Uni-BS will ex-
pand all the nodes exceptgoal, for a total of3·2
d
–1nodes,
1.5 times as many asMM
0. This ratio can be made arbitrarily
large by increasing the branching factor of the trees.
The general rule based on the core differences is:
GR1: FF and RN usually determine whetherMM
0will
expand fewer nodes than Uni-BS (FF>RN) or more.
5MM 0compared to A*
A heuristic,h, splits each region into two parts, the states in
the region that are pruned byh, and the states that are not
pruned. For example, FNU is the unpruned part of FN. The
set of states expanded by A* is therefore (modiﬁed Eq. 1):
(Eq. 5)A* = NFU + NNU + FNU + FFU
We ﬁrst compare the ﬁrst three terms to the corresponding
terms in Eq. 2 forMM
0and then compare FFU and RN
∀
.
Region NF:We expect A* to expand many nodes in NF.
These nodes haveg
F(n)≤C
∗
/2so A* would prune them
only ifh
F(n)>C
∗
/2. One might expectMM 0’s N
∀
Fto
be larger than A*’s NFU because A* prunes NF with a
heuristic. This underestimates the power of thegmin
F+
gmin
B+≤termination condition, which can cause N
∀
Fto
be much smaller than NFU. In Figure 2, a number with a
right-pointing arrow over it inside nodenish
F(n). Not
shown areh
F(C)=1andh F(s)=1∀s∈S 3. Region
NF containsstart,A,S
1andS 2. The heuristic does no
pruning in this region so these are all expanded by A*.MM
0
will not expand any nodenwithg F(n)=C
∗
/2(e.g.S 2)
so N
∀
F is half the size of NFU. As a second example, on
Rubik’s Cube instances withC
∗
=20,MM 0only expands
nodes withg
F(n)≤9because of this termination condi-
tion. Korf (1997)’s heuristic has a maximum value of11,so
A* with this heuristic will not prune any nodes in N
∀
F. In
general, we do not expect A* to have a large advantage over
MM
0in NF unless its heuristic is very accurate.
2
Region NN:As discussed above,MM 0might expand no
nodes in NN (i.e. N
∀
N
∀
is empty). Nodes in NN have
g
F(n)=g B(n)=C
∗
/2, so A*’sf(s)cannot exceed
C
∗
. Therefore, even with an extremely accurate heuristic,
A* may do little pruning in NN. For example, the heuristic
values shown on the left side of Figure 3 are consistent and
“almost perfect” (Helmert and R¨oger 2008) yet they produce
no pruning at all. A* behaves exactly the same as Uni-BS
and expands 1.5 times as many nodes asMM
0.
Region FN:We expect A* to expand far fewer nodes than
MM
0in FN. These nodes haveg F(n)>C
∗
/2and, being
relatively close togoal, we expect the heuristic values for
these nodes to be very accurate.
FFU vs RN
∀
:RN
∀
certainly can be much smaller than FFU.
In Figure 2, RN (G
1+G2) is about the same size as FF
(S
3), which is the same as FFU in this example. However,
becauseMM
0will not expand any nodes withg B(n)=C
∗
/2
in this example, RN
∀
is half the size of RN (RN
∀
contains
G
1but notG 2), soMM 0expands many fewer nodes in RN
than A* does in FF. On the other hand, FFU will certainly
be the same size as or smaller than RN
∀
with a sufﬁciently
accurate heuristic. In the extreme case, when RN
∀
is empty,
this requires a heuristic that prunes every node in FF. This is
not impossible, since no optimal path passes through FF, but
it does require an extremely accurate heuristic. Moreover, FF
without any pruning can be much smaller than RN
∀
. Deleting
S
3from Figure 2 makes FF empty, while RN
∀
can be made
arbitrarily large.
The general rule based on the core differences is:
GR2: When FF>RN, A* will expand more nodes in
FF thanMM
0expands in RN unless A*’s heuristic is very
accurate.
6MMcompared to A*
Modifying Eq. 2, the equation forMMis:
(Eq. 6)MM=N
∀
FU+N
∀
N
∀
U+FN
∀
B+RN
∀
B.
B has the same meaning as U, but is based onh
B, the
heuristic ofMM’s backwards search. For example, FNB is
the part of FN that is not pruned byh
B. In general, FNB
will be different than FNU, the part that is not pruned by
h
F, the heuristic used by A*. By deﬁnition, N
∀
FU≤NFU
and N
∀
N
∀
U≤NN, soMMhas an advantage over A* in NF
and NN.
Region FN:FNU is almost certainly smaller than FN
∀
B be-
cause in forward search, nodes in FN haveg
F(n)>C
∗
/2
andh
Fis estimating a small distance (at mostC
∗
/2).
By contrast, for the backwards search, nodes in FN have
g
B(n)≤C
∗
/2andh Bwould need to accurately estimate
a distance larger thanC
∗
/2to prune them. So, A* has an
advantage overMM’s backward search in FN.
2
We recognize the imprecision in terms like “very accurate”,
“inaccurate” etc. We use these qualitative gradations to highlight
that as the heuristic’s accuracy increases or decreases, the advan-
tage shifts from one algorithm to another.3414

FFU vs RNB:Not much pruning will usually occur dur-
ingMM’s backward search in RN because RN’sg
B-values
are small and the distances being estimated byh
Bare large.
However, if RN is much smaller than FF buth
Fis accurate
enough to make FFU smaller thanMM
0’s RN
∗
(see the dis-
cussion of FFU vs RN
∗
in Section 5), then we expect thath B
will be accurate enough to do some pruning in RN. Thus, we
expect FFU>RNB whenever RN is much smaller than FF.
The general rule based on this section’s analysis is the
same as GR2 withMM
0replaced byMM.
6.1 The Correctness of BK1
In our notation BK1 (page 1) is written as:
FNU + FFU<NNU + NFU=⇒Uni-HS<MM.
Here FNU + FFU is the number of nodes expanded by Uni-
HS beyond the solution midpoint, NNU + NFU is the num-
ber expanded at or before the solution midpoint.
Combining Eqs. 5 and 6 gives the exact expression:
(Eq. 7)Uni-HS<MM⇐⇒NFU + NNU + FNU + FFU
<N
∗
FU+N
∗
N
∗
U+FN
∗
B+RN
∗
B.
Differences between Eq. 7 and BK1 represent situations in
which BK1 will be incorrect. For example, BK1 ignores
region RN, but it can be the decisive factor determining
whetherMMexpands more nodes than Uni-HS.
On the other hand, it is easy to prove (Holte et al. 2015)
that BK1 will make correct predictions if the following con-
ditions all hold:(C1)NN, FNU, and FFU are all negligible
in size compared to NFU;(C2)FN
∗
B+RN
∗
B≈N
∗
FU;(C3)
NFU/2 <N
∗
FU.
C1–C3 hold in our experiment on the Pancake Puzzle with
the GAP heuristic (see the GAP rows for A* andMMnear the
bottom of Table 1) and we conjecture they held in Barker and
Korf’s experiments.
7MM 0compared toMM: an anomaly
Ifh1andh 2are admissible heuristics andh 1(s)>h 2(s)
for all non-goal nodes, then A* cannot expand more distinct
nodes withh
1than withh 2(Nilsson 1982). In particular, A*
with a non-zero heuristic cannot expand more nodes than
Uni-BS.
This is not necessarily true forMMor most Bi-HS algo-
rithms. In Figure 2 the value in a node is itsh-value in
the direction indicated by the arrow. All nodes in layerS
3
(G3)haveh F(s)=1(h B(s)=1).MMexpands all the
nodes inS
1andG 1because they havepr(s)=3while
pr
F(A)=pr B(C)=4.MMmight then expand any number
of nodes inS
2orG 2since they too havepr(s)=4.
3
By
contrast, we saw (Section 4) thatMM
0could stop before ex-
panding all the nodes inS
1andG 1and would never expand
a node inS
2orG2. Thus we see thatMM 0can expand strictly
fewer nodes thanMMwith a consistent, non-zero heuristic.
This example mimics behavior we saw with the GAP-2
and GAP-3 heuristics in the Pancake puzzle experiments be-
low. We believe it occurs commonly with heuristics that are
very accurate near the goal but inaccurate elsewhere. This
3
Bi-HS algorithms that strictly alternate search direction or use
the cardinality criterion to choose the direction will expand all the
nodes inS
2andG 2.
example reveals a fundamental dilemma for Bi-HS caused
by a tension between its two main stopping conditions:
S1:U≤max(fmin
F, fminB)
S2:U≤gmin
F+gminB+∈.
To satisfy S1 as quickly as possible, a node with minimum
f-value should be expanded, but to satisfy S2 as quickly
as possible a node with minimumg-value should be ex-
panded. These two node-selection rules will disagree if none
of the nodes with the smallestf-value also have the small-
estg-value. Breaking ties among nodes with the samef-
value in favor of largeg-values also causes the two selec-
tion rules to make different choices. When the two selection
rules disagree, a choice has to be made as to which stopping
condition to favor.MMand all previous Bi-HS methods are
hard-coded to favor S1. Bi-BS methods must favor S2, since
they have no heuristic. In situations like Figure 2, where S2
can be satisﬁed more quickly than S1, Bi-BS will outper-
form Bi-HS. Identifying conditions under which S2 is more
quickly satisﬁed than S1 is an important direction for future
research. For now, we offer the following conjecture:
GR3: With an inaccurate heuristic, Bi-HS will expand
more nodes than Bi-BS.
8 Experiments
The purpose of the experiments in this section is to verify the
correctness our general rules (GR1–GR3). Since some rules
refer to the sizes of certain regions, they could only be tested
in domains small enough to be fully enumerated. Likewise,
since some rules refer to a heuristic’s relative accuracy, we
used at least two heuristics of different accuracy in each do-
main. All heuristic used in these experiments were consis-
tent, not just admissible. The two domains used in our study
are the 10-Pancake Puzzle and Rubik’s Cube. In both do-
mains all problems are bi-friendly. Because GR1–GR3 make
predictions about the number of nodes expanded, that is the
only quantity we measure in our experiments.
8.1 10-Pancake Puzzle
We ranMM 0,MM, Uni-BS, and A* on 30 random instances
for each possible value ofC
∗
(1≤C
∗
≤11). We used
the GAP heuristic (Helmert 2010) and derived less accurate
heuristics from it, referred to as GAP-X, by not counting the
gaps involving any of the X smallest pancakes. For example,
GAP-2 does not count the gaps involving pancakes 0 or 1.
The trends reported below were similar for all values of
C
∗
. In addition, similar trends were obtained using a pattern
database (PDB) based on 6-X pancakes. Table 1 shows the
number of nodes expanded in each region for each algorithm
using each heuristic forC
∗
=10.
4
Row “Reg. Size” shows
the number of states in each region. Column “Total” is the
total of all the columns to its right. The total for Region Size
does not include region RF (it is not in the table because
none of the algorithms expand nodes in RF).
4
We present results forC
∗
=10because the trends reported
below were valid forall30 instances withC
∗
=10. There were a
few instances forC
∗
∈{6,7,9}that were exceptions. But, most
trends were clearly seen for these values ofC
∗
too.3415

TotalNFNN FF FNRN
Reg. Size3,555,95527,390553,501,12027,003387
Brute-force searches
Uni-BS1,743,54827,390551,704,02712,0770
MM0 5,5514,6200 0 91714
GAP-3
A* 97,64417,34655 75,4314,8120
MM 7,5074,0950 03,40211
MM-2g 106,53917,44655 78,73810,28911
GAP-2
A* 27,1629,96455 14,1912,9520
MM 6,7233,3110 03,40211
MM-2g 39,45310,25555 19,5429,59011
GAP-1
A* 4,2802,61155 852 7610
MM 2,4481,3500 01,0971
MM-2g 5,9672,66855 1,1312,1131
GAP
A* 117 9112 1 130
MM 165 880 0 770
MM-2g 165 880 0 770
Table 1: 10 pancake results: average nodes expansions by
region for instances withC
∗
=10.
We see that RN is small and FF is very large. Although we
regard GAP-3 as a inaccurate heuristic it does prune almost
all the nodes in FF. NF is identical in size to FN+RN because
of the symmetry in this space. The asymmetry ofMM
0’s ex-
pansions in NF and FN+RN is because, forC
∗
=10,MM 0
must expand all the nodes withg(s)=4in one direction
but not the other.MM’s expansions in these regions are much
more balanced. A*’s total is largely determined by NF with
the more accurate heuristics, but is determined by FF with
the less accurate heuristics. The bold results show that de-
pending onhthe algorithm expanding the fewest nodes is
A* (GAP),MM(GAP-1), orMM
0(GAP-2, GAP-3).
To examine the effect of the2gterm inMM’s deﬁnition
of a node’s priority, we ran an altered version ofMM, called
MM-2g, omitting the2gterm in the deﬁnition ofpr(n),so
noden’s priority is the usualf(n). We also added code to
preventMM-2gfrom expanding the same node in both di-
rections. Although not identical to any existing Bi-HS al-
gorithm, we believeMM-2g’s results are representative of
bidirectional search algorithms that do not meet in the mid-
dle. For all of the heuristics,MM-2gexpands many nodes in
FF and many more nodes thanMMin NF, NN, and FN.
GR1, GR2, and GR3 are all conﬁrmed by this experiment.
GR1:For every instance for every value ofC
∗
,FF>RN
andMM
0expanded fewer nodes than Uni-BS.
GR2:A* expands more and more nodes in FF as the
heuristic becomes less accurate, whileMMandMM
0always
expand less than half the nodes in RN. This trend holds
for individual instances, not just for averages. On all 30 in-
stances A* with the GAP heuristic does not expand more
nodes in FF thanMM
0expands in RN, and the opposite is
true for all instances when A* uses the other heuristics.MM
is similar – with all heuristicsMMexpands fewer nodes in
RN than A* does in FF on all instances.
h1997 h888
#dMM0 MMIDA* MM IDA*
116218M166M260M96.0M18.7M
2171.55B1.00B1.51B1.01B114M
3171.55B1.14B8.13B1.02B676M
4171.55B0.96B6.56B387M467M
5182.88B4.71B29.7B3.58B2.49B
6182.88B4.84B15.4B3.51B1.10B
7182.88B5.89B41.6B4.01B3.16B
8182.88B4.84B45.9B3.67B3.77B
9182.88B3.01B58.4B2.87B5.13B
10182.88B4.25B70.3B3.29B4.82B
Table 2: Rubik’s Cube results. M=million, B=billion.
GR3:With the best heuristic, GAP,MMexpands many
fewer nodes thanMM
0. As the heuristic becomes less accu-
rate, the difference betweenMMandMM
0steadily diminishes
and eventually (GAP-2) turns into a steadily growing advan- tage forMM
0. This trend holds for individual instances too.
8.2 Rubik’s Cube
We use two heuristics in this study:h 888is the more accu-
rate, using two 8-edge PDBs and the 8-corner PDB.h
1997is
the heuristic used to ﬁrst optimally solve random instances of Rubik’s Cube (Korf 1997). It is based on two 6-edge PDBs and the 8-corner PDB.
The Uni-HS algorithm is IDA* with the standard opera-
tor pruning rules for Rubik’s Cube (Korf 1997). Our imple- mentations ofMMandMM
0both use external memory. A full
description of these implementations is outside of the scope of this paper, but both algorithms are based on delayed du- plicate detection (Korf 2004). OurMM
0implementation ex-
pands a fullg-layer in one direction, and then removes dupli-
cates and checks for a solution. As a result, it always expands the maximum number of states in a layer before complet- ing. OurMMimplementation has priority-based open- and
closed-lists stored across two 500GB SSD disks. States with the same priority are found in the same ﬁle; before a set of states is expanded, duplicate detection against the closed list is performed. Then, solution detection is performed in par- allel to expanding states in the ﬁle. Because we only check for solutions when expanding a ﬁle, there can be a signiﬁ- cant delay between when the full solution is generated and detected. Improving this is an issue for future work. Because operator pruning is, in general, unsafe to use in conjunction with duplicate detection (Holte and Burch 2014),MMand
MM
0did no operator pruning.
Table 2 shows the results on each of the ten standard test
instances (Korf 1997).MM
0expands fewer nodes than IDA*
withh
1997on all instances except for instance #2 where
there was a very small gap. Due to tie-breaking within the
last iteration of IDA*, the differences on instances #1 and
#2 are not meaningful for either algorithm. This is consis-
tent with GR2 becauseh
1997is not especially accurate.
Withh
1997MMalways expands fewer nodes than IDA*.
In fact,MMwithh
1997expands fewer nodes than IDA* with
the superiorh
888on instances #9 and #10.MMexpands fewer
nodes thanMM
0on the easier instances (d =17) but more3416

on the harder ones (d =18). There are two possible expla-
nations for this. The ﬁrst is the anomaly phenomenon de-
scribed in Section 7. A heuristic that is sufﬁciently accurate
forMMto expand fewer nodes thanMM
0on easier instances
might not be sufﬁciently accurate on harder instances. The
second is related to the delayed duplicate (and solution) de-
tection. If we performed solution detection earlierMMwould
have certainly improved. But earlier solution detection in
MM
0could also improve its performance. Future work will
study termination conditions in external memory search. For
instance, an in-memory version ofMMexpanded only 75M
nodes on problem #1, while tie-breaking in the order of ﬁle
expansion for external-memoryMMcan signiﬁcantly worsen
its performance. The IDA* code expands more nodes per
second thanMM, but for instances #3-#10MMfound solutions
in less time than IDA*.
h
888is accurate enough (as is GAP on the Pancake puz-
zle) for IDA* to outperform theMMvariants for the easier in-
stances #1 (d =16) but theMMvariants expand fewer nodes
on the harder instances becauseh
888is not sufﬁciently ac-
curate on them.
9 Conclusions and future work
In this paper we introducedMM, the ﬁrst Bi-HS algorithm
guaranteed to meet in the middle. We also introduced a
framework that divides the state-space into disjoint regions
and allows a careful analysis of the behavior of the differ-
ent algorithms in each of the regions. We studied the var-
ious types of algorithms and provided some general rules
that were conﬁrmed by our experiments.
This paper initiated this direction. Future work will con-
tinue as follows:(1)A deeper analysis on current and new
MMvariants may further deepen our knowledge in this issue.
(2)A thorough experimental comparison should be done
on more domains and with more bidirectional search algo-
rithms.(3)Heuristics that are speciﬁcally designed forMM,
i.e., that only return values larger thanC
∗
/2are needed.
10 Acknowledgements
Thanks to Joseph Barker for answering questions and pro-
viding extra data related to (Barker and Korf 2015) and to
Sandra Zilles and Andr´e Grahl Pereira for suggesting im-
provements in the theoretical analysis ofMM. Financial sup-
port for this research was in part provided by Canada’s Nat-
ural Science and Engineering Research Council (NSERC)
and by Israel Science Foundation (ISF) grant #417/13. Com-
putational facilities for some of our experiments were pro-
vided by Compute Canada. This material is based upon work
supported by the National Science Foundation under Grant
No. 1551406.
References
Areﬁn, K. S., and Saha, A. K. 2010. A new approach of iterative
deepening bi-directional heuristic front-to-front algorithm (IDB-
HFFA).International Journal of Electrical and Computer Sciences
(IJECS-IJENS)10(2).
Auer, A., and Kaindl, H. 2004. A case study of revisiting best-
ﬁrst vs. depth-ﬁrst search. InProc. 16th European Conference on
Artiﬁcial Intelligence (ECAI), 141–145.
Barker, J. K., and Korf, R. E. 2015. Limitations of front-to-end
bidirectional heuristic search. InProc. 29th AAAI Conference on
Artiﬁcial Intelligence, 1086–1092.
Davis, H. W.; Pollack, R. B.; and Sudkamp, T. 1984. Towards
a better understanding of bidirectional search. InProc. National
Conference on Artiﬁcial Intelligence (AAAI), 68–72.
de Champeaux, D., and Sint, L. 1977. An improved bidirectional
heuristic search algorithm.J. ACM24(2):177–191.
de Champeaux, D. 1983. Bidirectional heuristic search again.J.
ACM30(1):22–32.
Eckerle, J. 1994. An optimal bidirectional search algorithm. In
Proc. KI-94: Advances in Artiﬁcial Intelligence, 18th Annual Ger-
man Conference on Artiﬁcial Intelligence, 394.
Helmert, M., and R¨oger, G. 2008. How good is almost perfect? In
Proc. 23rd AAAI Conference on Artiﬁcial Intelligence, 944–949.
Helmert, M. 2010. Landmark heuristics for the pancake problem.
InProc. 3rd Annual Symposium on Combinatorial Search, (SoCS).
Holte, R. C., and Burch, N. 2014. Automatic move pruning for
single-agent search.AI Communications27(4):363–383.
Holte, R. C.; Felner, A.; Sharon, G.; and Sturtevant, N. R. 2015.
Bidirectional search that is guaranteed to meet in the middle: Ex-
tended Version. Technical Report TR15-01, Computing Science
Department, University of Alberta.
Ikeda, T.; Hsu, M.-Y.; Imai, H.; Nishimura, S.; Shimoura, H.;
Hashimoto, T.; Tenmoku, K.; and Mitoh, K. 1994. A fast algorithm
for ﬁnding better routes by AI search techniques. InProc. Vehicle
Navigation and Information Systems Conference, 291–296.
Kaindl, H., and Kainz, G. 1997. Bidirectional heuristic search
reconsidered.J. Artiﬁcial Intelligence Resesearch (JAIR)7:283–
317.
Kaindl, H., and Khorsand, A. 1994. Memory-bounded bidirec-
tional search. InProc. 12th National Conference on Artiﬁcial In-
telligence (AAAI), 1359–1364.
Korf, R. E. 1997. Finding optimal solutions to Rubik’s Cube using
pattern databases. InProc. 14th National Conference on Artiﬁcial
Intelligence (AAAI), 700–705.
Korf, R. E. 2004. Best-ﬁrst frontier search with delayed duplicate
detection. InProc. 19th National Conference on Artiﬁcial Intelli-
gence (AAAI), 650–657.
Kwa, J. B. H. 1989. BS*: An admissible bidirectional staged
heuristic search algorithm.Artiﬁcial Intelligence38(1):95–109.
Nicholson, T. A. J. 1966. Finding the shortest route between two
points in a network.The Computer Journal9(3):275–280.
Nilsson, N. J. 1982.Principles of Artiﬁcial Intelligence. Springer.
Pohl, I. 1969. Bi-directional and heuristic search in path problems.
Technical Report 104, Stanford Linear Accelerator Center.
Politowski, G., and Pohl, I. 1984. D-node retargeting in bidirec-
tional heuristic search. InProc. National Conference on Artiﬁcial
Intelligence (AAAI), 274–277.
Rokicki, T.; Kociemba, H.; Davidson, M.; and Dethridge, J. 2013.
The diameter of the Rubik’s Cube group is twenty.SIAM J. Dis-
crete Math.27(2):1082–1105.
Sadhukhan, S. K. 2012. A new approach to bidirectional heuristic
search using error functions. InProc. 1st International Conference
on Intelligent Infrastructure at the 47th Annual National Conven-
tion COMPUTER SOCIETY of INDIA (CSI-2012).3417