# improved_anonymous_multiagent_path_finding_algorithm__988e575a.pdf

Improved Anonymous Multi-Agent Path Finding Algorithm

Zain Alabedeen Ali1, Konstantin Yakovlev2, 3

1 Moscow Institute of Physics and Technology, Moscow, Russia 2 Federal Research Center for Computer Science and Control of the Russian Academy of Sciences, Moscow, Russia 3 AIRI, Moscow, Russia ali.za@phystech.edu, yakovlev@isa.ru

We consider the Anonymous Multi-Agent Path-Finding (AMAPF) problem where the agents are confined to a graph, a set of goal vertices is given, and each of these vertices has to be reached by some agent. The problem is to find an assignment of the goals to the agents as well as the collisionfree paths, and we seek to find the solution with the minimal makespan. A well-established approach to solving this problem is by reducing it to a special type of graph search problem, i.e. to the problem of finding a maximum flow on an auxiliary graph induced by the input one. The size of the former graph may be very large, and the search on it may become a bottleneck. To this end, we suggest a specific search algorithm that leverages the idea of exploring the search space not through considering separate search states but rather bulks of them simultaneously. That is, we implicitly compress, store, and expand bulks of the search states as single states, reducing the runtime and memory consumption. Empirically, the resultant AMAPF solver demonstrates superior performance compared to the state-of-the-art competitor and is able to solve all publicly available MAPF instances from the wellknown Moving AI benchmark in less than 30 seconds.

Introduction The Multi-Agent Path Finding (MAPF) problem is a problem which generally asks to find a set of collision-free paths for a set of agents that operate in a shared environment and have to reach predefined goal locations from the current (start) ones. MAPF has many applications, including automated warehouses, autonomous vehicles, and video games and is being widely studied in the literature. Depending on the application, many variants of MAPF have been proposed (Stern et al. 2019) and numerous solutions have been already presented. One variant is the Anonymous MAPF (AMAPF). In AMAPF, the agents are interchangeable and each agent may be assigned to any goal, assuming that, in the end, all goal locations will be reached (in case the number of goal locations is smaller than or equal to the number of agents) or each agent will arrive to one goal location (otherwise). This problem naturally arises in such environments where the tasks can be performed by any agent, e.g. identical robots carrying packages/inventory pods in automated warehouses.

Copyright 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

In this work we aim to solve the AMAPF problem optimally w.r.t makespan cost function, which is the arrival time of the last agent. In other words, our task is to find a solution where the last agent arrives at its goal location as early as possible. State-of-the-art optimal AMAPF solvers are reduction-based, i.e. the initial problem is reduced to another one and the latter is solved with an off-shelf solver. In the case of AMAPF, the common reduction is the following. Based on the input graph, another one is constructed. Then a maximum flow problem on this auxiliary graph, called a network, is formulated and solved. The latter can be interpreted as finding several paths (subject to certain constraints) on the reduced network. The major bottleneck here is that the size of the network is much larger, both in the number of vertices and edges, compared to the initial AMAPF graph, and, therefore, finding paths on it is burdensome. Moreover, the AMAPF reduction scheme, in general, assumes that numerous networks may be consecutively constructed (each one being larger than the previous one) and the search should be repeated. To this end, we present an improved optimal AMAPF solver that follows the reduction-to-the-flow-problem approach while utilizing a novel search method to find paths on the (flow) networks. The crux of our search method is the concept of bulk states and implicit expansions. In brief, instead of generating and expanding numerous search states, we compress them into the bulks that form a sequence, exploiting the special structure of the underlying network, and explicitly store in the search tree only the certain representatives of those bulks (while implicitly reasoning about all other states in the bulk). On the theoretical side, we show that our search method, dubbed Bulk Search, is complete. On the practical side, we compare our improved AMAPF solver that utilizes Bulk Search with the state-of-the-art optimal AMAPF solver and show that our algorithm notably scales better to large maps (due to significantly lower number of expansions when finding the paths on the flow networks) and outperforms the competitor on all maps of the well-known MAPF benchmark from (Stern et al. 2019).

Related Works In a conventional MAPF formulation (Stern et al. 2019) a set of agents is given as well as the specification where each agent starts and the goal it should reach. Even when

The Thirty-Eighth AAAI Conference on Artiﬁcial Intelligence (AAAI-24)

both the time is discretized into time steps and workspace is discretized into a graph (which are the two default assumptions in MAPF), obtaining an optimal solution w.r.t. one of the most widely-used objectives, e.g. the makespan or the sum-of-costs, is known to be NP-hard (Yu and La Valle 2013b). Surprisingly, the AMAPF problem, which is a combined problem of both MAPF and goal assignment, can be optimally solved w.r.t. makespan (but not the sum-ofcosts) in polynomial time (Yu and La Valle 2013a). The seminal method, introduced in (Yu and La Valle 2013a), is based on the reduction of AMAPF to a series of specific graph-search problems, i.e. the problems of finding a maximum flow (Ford Jr and Fulkerson 2015) on a graph of special structure (network) induced by the input MAPF graph. For the sum-of-costs objective an adaptation of the seminal MAPF solver, Conflict-Based Search (CBS) (Sharon et al. 2015) was suggested in (H onig et al. 2018). Indeed, this algorithm is not polynomial. Suboptimal AMAPF was studied in (Okumura and D efago 2022); several computationally efficient algorithms were proposed in this work which were empirically shown to provide high-quality solutions. However, no bound on sub-optimality was theoretically guaranteed. A variant of the AMAPF problem with some additional practically inspired assumptions, i.e., that the number of goals exceeds the number of agents and thus agents have to move to the new goals upon completing the current ones, was explored in (Nguyen et al. 2017) and solved using the Answer Set Programming (ASP).

More involved variants of AMAPF were studied in (Ma and Koenig 2016; Bart ak, Ivanov a, and ˇSvancara 2021). It was assumed that the agents are partitioned into the teams (colors) and each team is assigned a set of interchangeable targets (of the same color). In (Ma and Koenig 2016), a combination of CBS and min-cost max-flow algorithm (Ford Jr and Fulkerson 2015) was suggested to solve this Colored MAPF problem. Bart ak, Ivanov a, and ˇSvancara (2021) proposed several solvers that utilize reduction to SAT. Indeed, AMAPF can be viewed as a special instantiation of the Colored MAPF problem (i.e. the one when there exists only a single team of agents of the same color as all the goals).

Among the other problems that are closely related to AMAPF, one can name Lifelong MAPF (LMAPF) (Li et al. 2021) and Multi-agent Pickup and Delivery (MAPD) (Ma et al. 2017). These MAPF variants assume that the agents continuously operate in the environment reaching the specified goals (associated with certain pickup-and-delivery tasks in the case of MAPD). However, the assignments of goals (tasks) to agents is commonly assumed to be realized by an external procedure and, thus, the assignment sub-problem is not typically considered as part of the LMAPF/MAPD problem. Still, there are papers that consider a combined problem (Chen et al. 2021; Xu et al. 2022).

Finally, a body of works studies AMAPF in continuous domains, i.e. not assuming that the agents are confined to a given graph but are rather allowed to freely move in the (geometric) workspace (Adler et al. 2015; Solovey and Halperin 2016).

Problem Statement We follow a classical approach (Stern et al. 2019) to define the problem under investigation AMAPF. We consider a graph G = (V, E), whose vertices correspond to the locations in the environment and edges to the transitions between them. k agents are confined to this graph, i.e., initially each agent occupies a (distinct) vertex si, the start vertex, and at each time step of the discretized timeline, it can either wait in its current vertex or move to an adjacent one. The duration of both types of actions (move or wait) is 1 time step. k goal vertices, g1, ..., gk, are also distinguished. It is assumed that any agent can reach any goal, i.e., there is no pre-defined assignment of agents to the goals. A plan for an agent, π(s, g), is a sequence of (move/wait) actions, s.t. it begins at vertex s and ends at vertex g; each action in the plan starts where the the previous one ends. The cost of the plan is the time step by which g is reached. Two plans are said to contain a vertex (similarly, an edge) conflict if the agents following them occupy the same vertex (use the same edge) at the same time step. The problem now is to find a set of plans Π = {π1, ..., πk}, s.t. (1) each pair of plans is conflict-free and (2) all goals are reached. Essentially, this problem is a combination of the assignment problem, where one needs to decide which agent goes to which goal, and the (multi-agent) pathfinding problem, where one needs to construct a set of conflict-free plans. We consider the following cost objective: makespan(Π) = maxi 1,...,k(cost(πi)), where cost(π) is the cost of the individual plan (i.e. the earliest time step when the agent reaches a goal vertex and never moves away). In this work, we are interested in obtaining makespan-optimal solutions of the problem at hand (AMAPF).

Background Network Flow Generally, network flow problem might come in different flavors; see (Ahuja, Magnanti, and Orlin 1995) for an overview. Here we focus on a specific variant of the problem needed for solving AMAPF problems. A network is a tuple N = (G, cap, s, g), where G = (V, E) is a directed graph, cap : E Z+ is the mapping defining the capacities of the edges, s V is the source vertex, and g V is the sink vertex. For vertex v V , let σ+(v) (resp. σ (v)) denote the set of edges of G going to (resp. leaving) v. A feasible s, g-flow on the network is mapping f : E Z+ that satisfies three types of constraints: edge capacity constraints:

e E, f(e) cap(e), (1)

the flow conservation constraints at non-terminal vertices:

v V \ {s, g}, X

e σ+(v) f(e) X

e σ (v) f(e) = 0, (2)

and the flow conservation constraints at terminal vertices:

e σ (s) f(e) = X

e σ+(g) f(e). (3)

The Thirty-Eighth AAAI Conference on Artiﬁcial Intelligence (AAAI-24)

The quantity F(f) is called the value of the flow f. Another interpretation of the flow is that the flow is a set of s-g paths (possibly overlapping or even duplicating), where each path caries a unit of flow from s to g, such that the sum of units passing through any edge does not exceed its capacity. The standard single-commodity maximum flow problem asks the following question: Given a network N, what is the maximum F(f) that can be pushed through the network? Alternatively, find a set of s-g paths that carry the maximum units of flow through the network.

From AMAPF to Network Flow In (Yu and La Valle 2013b), the authors reduced the T-steps AMAPF problem, i.e. the one that allows any agent to do at most T actions, to a maximum flow (MF) problem. Specifically, it was proven that a T-steps AMAPF problem has a solution iff the reduced MF problem has a flow equal to the number of agents. The makespan for an AMAPF instance can therefore be found by finding the smallest T such that T-steps AMAPF instance has a solution. We now explain the suggested reduction with a slight modification suggested by (Liu et al. 2019) that simplifies it. Consider a T-steps AMAPF instance with the graph G = (V, E). We first create 2T +1 copies of V and mark them as follows: 0, 1, 1 , 2, 2 , ..., T ; see Fig. 1. Hereinafter, we will use the term vertices to denote the elements of the original AMAPF graph and the term nodes to denote the elements of the constructed network. We will also call copies t (with apostrophe) and copies t for t = 0, 1, ..T as outer and inner copies, respectively. The copied vertices of the original graph form the nodes of the network. Each node is identified by (v, h), where h stands for the copy, alternatively referred to as height. Indeed, the node (v, h) (as well as (v, h )) corresponds to the state of an agent located in vertex v at time step h. Then, for each edge e(u, v) in the original graph, we connect the nodes (u, h ) and (v, h+1) for h > 0, and also connect (u, 0) and (v, 1). These edges correspond to the move actions, and we call them the move edges. Then, we add the wait edges that connect the nodes (u, h ) and (u, h + 1) for h > 0 and the nodes (u, 0) and (u, 1). Additionally, we add the edges between the nodes (u, h) and the (u, h ). These edges do not denote any action but are added to forbid any two s-g paths in the network from sharing any node and, therefore, to avoid node-collisions. We call them the restriction edges. Finally, we add the source s and sink g nodes and connect s to all nodes (u, 0) where u is a start vertex, and connect g to nodes (v, T ) where v is a goal vertex. The capacity of any edge is fixed to be 1. Hereinafter, whenever a path in a network is mentioned we will mean the s-g path. The matching between the AMAPF and MF is now straightforward. Each plan for an agent can be matched to a path in the network by matching the agent actions to the move or wait edges and using the restriction edges to connect between them. Similarly, a path in the network is matched to a plan for an agent where the move/wait edges are matched to move/wait actions. See Fig. 1 for a self-contained example. It was proven that there are no shared nodes in any two paths in

0 1 1' 2 2'

0 1 1' 2 2'

0 1 1' 2 2' T'

0 1 1' 2 2' g T'

Figure 1: Example of the flow network (left) for a T-steps AMAPF instance (right). Each line in the flow network represents the copies of a single vertex in the original AMAPF graph. The diagonal, solid horizontal, and dashed horizontal edges in the network denote move, wait, and restriction edges, respectively. In this example, the AMAPF instance has two start vertices A, B and two goal vertices C, D, so the source node s in the network is connected to the nodes (A, 0), (B, 0) and the sink g is connected to (C, T ), (D, T ). The example also shows the matching between the plans for the agents and the s-g paths in the network (green and yellow edges).

the network (that form the solution to the MF problem), which infers that all matched plans have no node-collisions. An edge-collision may happen if two paths pass the edges ((u, h ) (v, h + 1)) and ((v, h ) (u, h + 1)) as these two different edges in the network refer to the same edge in the original graph. However, using the approach suggested in (Liu et al. 2019), these collision can be eliminated in the following fashion. Instead of two conflicting agents moving to their next vertices, they swap plans and continue moving by the other s plan. As a result, the AMAPF solution can be obtained by finding the maximum number of paths (maximum flow) in the described network.

Solving Maximum Flow

Ford-Fulkerson algorithm (Ford and Fulkerson 1956) was suggested in (Yu and La Valle 2013b) to solve the maximum flow problem. The algorithm is simple and easy to implement; its complexity on the reduced networks is O(k EV ) (formulated in (Yu and La Valle 2013b)), where E, V stand for the number of nodes and edges in the original graph. Since, in our MF problem, all the edges have a capacity of one and the value of the flow is bounded by the number of agents (as each path is to be matched to an agent s plan), Ford-Fulkerson can be considered as one of the most efficient solvers for our case (see (Cruz-Mej ıa and Letchford 2023) for more details about Maximum Flow algorithms). We refer the reader to the original paper for the detailed description of Ford-Fulkerson while briefly describing how the algorithm works specifically for the reduced network. In particular, the following two steps are sequentially repeated. First, we find a path p from s to g in the network. Second, we reverse all edges of p. We keep repeating these two steps until no path can be found. The found paths form the maximum flow in the network (in our case, they form the plans of the agents in the original AMAPF problem) 1.

1Arxiv version of this paper contains an illustrative example.

The Thirty-Eighth AAAI Conference on Artiﬁcial Intelligence (AAAI-24)

Solving MF Efficiently On The Introduced Networks The maximum value of T needed to solve an AMAPF problem can be up to k + V 2, as shown by (Yu and La Valle 2013b). This implies that the network size may be quadratic in the number of vertices of the original graph. Therefore, finding an s-g path may become a bottleneck when solving AMAPF instances involving large input graphs. To this end, we propose an algorithm that takes advantage of the specific structure of the reduced network and, as a result, is able to solve AMAPF problems much faster. First, we present some definitions to use in the algorithm description. In our algorithm, the search state corresponds to the network node. We will use n(v, h) to denote the search state. Recall that the network node is defined by the vertex in original graph, v, and the height (copy) h (a higher node means a higher copy, and h is higher than h). Let us define a connected-sequence as a sequence of nodes with the same vertex in which we can achieve the last node from the first node using only wait and restriction not-reversed edges, and it cannot be extended in either side (i.e., it has the maximum length). An example is shown in Fig. 2. Initially, for each vertex v of the original graph, we have only one connected-sequence (v, [0, T ]) i.e. the one that starts with the node with height 0 and ends with height T (shown by yellow crossbars). After a path (green path) is found, its edges are reversed (red edges in the lower graph). As a result, some connected-sequences disconnect which leads to several connected-sequences at the same vertex (see the lower network).

Idea The suggested algorithm is a graph traversal algorithm where the order of the search states in OPEN (the set of nodes that are the candidates to be expanded at the next iteration of the algorithm) is determined by their heights, i.e., the states with the lower heights are expanded first. The crucial idea of the algorithm is to expand states in bulks while searching. That is, instead of expanding one search state, i.e. generating its successors and marking it as visited (adding it to CLOSED), in each search iteration, we (implicitly) expand a bulk of states at once. Such bulk expansion can be effectively implemented using the introduced notion of the connected sequence, resulting in the reduction of the expansions number, time and memory compared to expanding states individually. Next, we describe how we can form the bulks of states and how the successors can be found compactly and fast. We note that a similar but more specific concept was used in (Phillips and Likhachev 2011) and (Gonzalez, Dornbush, and Likhachev 2012) to implicitly compress and expand the states generated by the wait actions of agents 2. Let us assume that while traversing the graph naively by

2The mentioned algorithms were originally tailored to graphbased pathfinding in the presence of dynamic obstacles where each vertex of the graph had to be annotated with the (safe) time intervals and the search utilized implicit move-then-wait actions. Our algorithm on the other hand handles explicit graphs, and the notion of connected-sequence in general is not obligated to relate to the time dimension, as will be shown later in the paper.

single nodes, we are to expand a state (v, h) located in a connected-sequence (v, [hmin, hmax]). We can note that the nodes with the higher heights inside the connected-sequence (i.e. the nodes (v, x) : x [h + 1, hmax]) are all achievable from (v, h) via the wait and restriction edges. Hence, the idea is to directly (and implicitly) generate all of these states ((v, x) : x [h + 1, hmax]), form an implicit bulk consisting of these states (including the originally picked-up node (v, h)), and expand them all at once (instead of expanding only (v, h)). We will refer to the described mechanism of generating the sequential successors of states that uses only wait and restriction edges as straightforward generation. Note that it is enough to store the vertex and the height bounds of the bulk to define it. So, technically, when forming a bulk (for future expansion), we only need to find the last straightforwardly achievable state (i.e. the highest state in the connected-sequence). When the bulk is ready, it is expanded in the following fashion. First, let us assume the naive expansion of a bulk when, for every node that resides in it, we generate all of its immediate successors. Now observe that, as a result of such procedure, we are likely to have numerous successors that are characterized by the same graph vertex and different heights. Moreover, many of these successors may belong to the same connected-sequence. Thus, instead of explicitly generating them, we generate only the ones with lower heights in their sequences. The other search states from these sequences will be straightforwardly generated later on (i.e., when the search state with the lowest copy is picked for expansion and its bulk is formed as described above). In other words, to expand a bulk of states (v, [hl, hu]), we do the following. We iterate over all connected-sequences in neighbor vertices of v, in which we can achieve at least one node (from a node (v, x) : x [hl, hu]). Then, we only generate the accessible node with the minimum height in each of these connected-sequences. The periodic structure of the reduced networks allows us to quickly find the node with the minimal height (as we will show later). As the number of connected-sequences is much less than that of individual nodes, it leads to a high reduction in the number of generated states. We call the algorithm that utilizes the described concepts Bulk Search (BS). Next, we describe the details of the implementation.

Details Algorithm 1 shows the pseudo-code of a graph traversal algorithm with our modifications. First, we order the states inside the search set (OPEN) by their height, i.e., we always choose the state with the minimum height for the expansion. This will help us reduce the number of expansions as will be shown later. The second change is that whenever we need to check whether a chosen state was expanded before, we additionally check if the state can be straightforwardly generated from the previously opened states. In this case, we do not need to expand it, as we assume that it was implicitly expanded before and its successors were already generated and inserted into the search set. This can be done by checking if any state in the same connectedsequence and lower height was expanded before (i.e. stored in the CLOSED set) (lines 11-14). This check should also

The Thirty-Eighth AAAI Conference on Artiﬁcial Intelligence (AAAI-24)

Algorithm 1: Bulk Search

Input: Network N(G, cap, s, g) Output: Path from s to g if exists 1: OPEN ϕ, CLOSED ϕ 2: insert s OPEN 3: while OPEN = ϕ do 4: remove n(v, h) with the minimum h from OPEN 5: if n = g then 6: return path from s to n 7: end if 8: if n CLOSED then 9: continue 10: end if 11: x(v, h ) the state from CLOSED in the same connectedsequence of n and minimum height 12: if x exists and h <= h then 13: continue 14: end if 15: succ get Successors(n) 16: for n (u, h ) in succ do 17: if n OPEN CLOSED then 18: continue 19: end if 20: x(u, h ) the state from OPEN CLOSED in the same connected-sequence of n and minimum height 21: if x exists and h <= h then 22: continue 23: end if 24: insert n OPEN 25: end for 26: end while 27: return no answer

Algorithm 2: Generating successors

Input: Network N(G, cap, s, g), Node n(v, h) Output: The successors of n(v, h) 1: if n = s then 2: return all neighbor nodes of n in G 3: end if 4: succ ϕ 5: [hmin, hmax] height bounds of the connected-sequence where n is located 6: if h = hmin then 7: insert all neighbor nodes of n in G succ 8: end if 9: insert all neighbor nodes of node (v, hmax) in G succ 10: for each connected-sequence cs(u, [hl, hu]) with vertex u is a neighbor of v do 11: c from the height of the minimum outer copy in [max(hmin + 1, h), hmax 1] 12: cto the height of the minimum inner copy in [hl, hu] 13: if c from + 1 <= hu and cto 1 <= hmax then 14: cmin max(cto, c from + 1) 15: insert (u, cmin) succ 16: end if 17: end for 18: return succ

be done when inserting new states into the OPEN set (lines 20-23). The third change is how we generate the successors of all

0 1 1' 2 2'

0 1 1' 2 2'

0 1 1' 2 2' g T'

0 1 1' 2 2'

0 1 1' 2 2'

0 1 1' 2 2' g T'

Figure 2: Example showing connected-sequences on the network. Yellow crossbars denote the connected-sequences on each vertex. Initially, we have the connected-sequences as shown in the upper figure. After a path (green one) is found and its edges are reversed, the connected-sequences are divided as shown in the lower figure.

states (the taken one from OPEN along with its straightforwardly generated states) fast. The pseudo-code is presented in Algorithm 2. Firstly, (lines 1-3), if the node is the source node s, we have only one node as input (i.e. no implicit states), and, thus, we need to only generate neighboring successors (i.e. connected nodes in the network) and return them. Otherwise, let the input node n(v, h) (which is not s or g) located in the connected-sequence (v, [hmin, hmax]), then we should generate the successors of all states n(v, x) : x [h, hmax]. This can be done as follows. If the state (v, h) is located at the beginning of the connected-sequence (i.e. h = hmin), this state may have reversed edges, so we always generate all neighboring nodes of this state in the network. The same thing is applied on (v, hmax) where we also generate all its neighboring nodes in the network. Other nodes (i.e. nodes (v, x) : x [max(hmin +1, h), hmax 1]) have only move edges to connect to nodes in other connectedsequences, so we can do the following (lines 10-17) to generate their successors. We iterate over all connected-sequences cs in neighbor vertices of v. We then check whether we can achieve at least one node in cs (from nodes (v, x) : x [max(hmin + 1, h), hmax 1]). As shown in lines 11-13, this can be done by checking whether there is at least one move edge which starts from an outer copy in [max(hmin + 1, h), hmax 1] and ends at inner copy in cs. If so, we generate the node with the minimum accessible height in cs and add it to the successors set (lines 14-15). As a result, any achievable node beginning from the source node is either explicitly or implicitly (from a node with lower height in the same connected-sequence) expanded. Therefore, we can immediately state the following theorem. Theorem 1. BS is a complete algorithm.

Theoretical analysis The search states inside the search set are sorted according to the height. This helps to reduce the number of expansions as the lower-height states implicitly expand the higher-height states in the same connectedsequence, but the opposite is not applicable. In theory, mul-

The Thirty-Eighth AAAI Conference on Artiﬁcial Intelligence (AAAI-24)

01 10 20 30

SR percentage (%)

flow-BS flow

Figure 3: The (normalized) number of instances solved by a certain time cap.

tiple single states in one connected-sequence may be expanded individually, e.g., if they have been opened in the descending-height way. This is possible even if we order the states according to the heights as we have reversed edges which generates states with a lower height. However, in practice, only few states are expanded in each connectedsequence, and, thus, the algorithm s performance mainly depends on the number of connected-sequences. Fortunately, the number of connected-sequences is much smaller than the size of the network. Initially, we have a number of connected-sequences equals V , the number of original graph vertices. After each path is found and reversed, T new reversed edges, and therefore, T new connected-sequences appear. As a result, in the whole search for all k agents, the number of the appearing connected-sequences equals Pi=k i=1 |V | + T(i 1) = k|V |+Tk(k 1)/2. Therefore, we have a theoretical reduction in the number of nodes (comparing with k|V |T, the number of nodes without compressing in connected-sequences) equals min(|V |/k, T/2) (i.e. (k|V | + Tk(k 1)/2) min(|V |/k, T/2) <= k|V |T 3). This reduction is significantly high, which allows us to obtain the fastest full success (to our knowledge) in optimally solving all public MAPF benchmarks for a MAPF-family problem, as will be shown in the next section.

Experimental Evaluation We have implemented the improved AMAPF solver in C++4 and compared it with the state-of-the-art AMAPF solver that does not use the introduced Bulk Search to solve MF but rather utilizes the standard Ford-Fulkerson as suggested in (Yu and La Valle 2013b). The code of the competitor was taken from public repository 5 that accompanied the paper (Okumura and D efago 2022). We kept all the optimization techniques designed by the code authors. We will denote these two solvers as flow-BS (ours) and flow (state of the art). The experiments were conducted on a PC with Intel Core i7-10700F CPU @ 2.90GHz 16 and 32Gb of RAM. The MAPF maps and instances were taken from the publicly available MAPF benchmark (Stern et al. 2019). We use all 33 maps available in this benchmark and all of the 25

3See Arxiv version of the paper for details. 4https://github.com/Path Planning/AMAPF-MF-BS 5https://github.com/Kei18/tswap

Map Width Height Algorithm flow flow-BS empty-8-8 8 8 100% 100% empty-16-16 16 16 100% 100% maze-32-32-2 32 32 100% 100% room-32-32-4 32 32 100% 100% maze-32-32-4 32 32 100% 100% random-32-32-20 32 32 100% 100% random-32-32-10 32 32 100% 100% empty-32-32 32 32 100% 100% empty-48-48 48 48 100% 100% den312d 65 81 100% 100% room-64-64-8 64 64 100% 100% random-64-64-20 64 64 100% 100% room-64-64-16 64 64 100% 100% random-64-64-10 64 64 100% 100% warehouse-10-20-10-2-1 161 63 100% 100% ht chantry 162 141 100% 100% maze-128-128-1 128 128 3% 100% ht mansion n 133 270 96% 100% warehouse-10-20-10-2-2 170 84 100% 100% lt gallowstemplar n 251 180 82% 100% maze-128-128-2 128 128 4% 100% ost003d 194 194 57% 100% lak303d 194 194 44% 100% maze-128-128-10 128 128 17% 100% warehouse-20-40-10-2-1 321 123 91% 100% den520d 256 257 16% 100% w woundedcoast 642 578 1% 100% warehouse-20-40-10-2-2 340 164 8% 100% brc202d 530 481 1% 100% Paris 1 256 256 256 5% 100% Berlin 1 256 256 256 6% 100% Boston 0 256 256 256 4% 100% orz900d 1491 656 0% 100%

Table 1: Success rates of flow-BS and flow solvers with a timeout of 30 seconds.

random scenarios. Each scenario on each map (except some small ones) contains 1,000 pairs of start-goal positions. To test a solver on a scenario, we run it with the first 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and 1,000 pairs of start-goals sequentially. Whenever the solver fails to solve a problem under a time limit of 30 seconds, we terminate testing on this scenario and move to the next one. In the first experiment, we have used a precise estimator of T suggested by (Okumura and D efago 2022) (which solves bottleneck assignment problem (Gross 1959)) to estimate the lower bound of the makespan. As this estimator takes non-negligible time (up to 10 seconds) and both algorithms use it, we have not accounted for its runtime. For each map, we have collected the total number of success instances over all scenarios. Table 1 summarizes the results. As can be seen, our solver is able to solve all instances in all maps under 30s. However, this is not the case for flow solver. The latter searches the network node-by-node and, therefore, is able to solve the test instances only on the small maps. When the maps are large (like city maps that are 256 256) or the makespan is high (like in the maze maps), it is often unable to produce a solution under the imposed

The Thirty-Eighth AAAI Conference on Artiﬁcial Intelligence (AAAI-24)

26 27 28 29 1000 #Agents

#Average Expanded Nodes

1e8 Berlin_1_256 (256x256)

flow flow-BS

26 27 28 29 1000 #Agents

#Average Expanded Nodes

1e7 lt_gallowstemplar_n (180x251)

flow flow-BS

26 27 28 29 1000 #Agents

#Average Expanded Nodes

1e6 random-64-64-20 (64x64)

flow flow-BS

Figure 4: The number of expanded nodes with a different number of agents on different maps.

Tmin/4 Tmin/2 Tmin Tmin * 2

#Expanded Nodes

1e8 maze-128-128-10 (128x128)

flow, k=256 flow, k=1000 flow-BS, k=256 flow-BS, k=1000

Tmin/4 Tmin/2 Tmin Tmin * 2

#Expanded Nodes

1e7 warehouse-20-40-10-2-2 (164x340)

flow, k=256 flow, k=1000 flow-BS, k=256 flow-BS, k=1000

Tmin/4 Tmin/2 Tmin Tmin * 2

#Expanded Nodes

1e6 room-64-64-16 (64x64)

flow, k=256 flow, k=1000 flow-BS, k=256 flow-BS, k=1000

Figure 5: The figure shows (for a specific map and a specific number of agents) how the number of expansions changes with the increasing values of T. Here Tmin denotes the optimal makespan.

time limit. Fig. 3 shows the success rate for all instances in all maps second by second. In fact, flow-BS is able to solve the hardest instance (orz900d map, scenario 20 with 256 agents) in 17s. On the other hand, flow has shown a very slight increase of SR after solving the easy instances ( 75% of instances) around the 8th second.

In the second experiment, we have investigated how the number of agents affects the performance. For this purpose, we selected three maps of different topology and size and plotted the average number of the expanded nodes against the number of agents. The results (plotted in Fig. 4) show that our algorithm expands much fewer nodes than the standard algorithm (as expected). It is worth noting that the relation between the runtime and the number of expansions is not fixed but increases for flow, too. This is because a single memory-access/read/write operation takes longer when the amount of the stored data increases.

So far, we have assumed that we have an estimator which can identify an exact bound (T) of the makespan. However, this may be not the case, so the search may be repeated several times until finding the optimal solution. Therefore, we have also conducted experiments to show the practical performance of both solvers for different T when we fix the map and the number of agents. The tests are designed as follows. For each one of the three fixed maps, warehouse-20-40-10-2-2, maze-128-128-10, room-64-64-16, and for a number of agents {256, 1000}, we run the solvers on networks with a maximum height equals one of the values

{Tmin/4, Tmin/2, Tmin, Tmin 2}, where Tmin is the optimal makespan. Again, we have recorded the average number of expanded nodes over all scenarios (see Fig. 5). Obviously, the number of nodes expanded by flow significantly increases with the value of T, while flow-BS does not demonstrate such growth. It means that our solver is especially useful when one cannot estimate the optimal makespan accurately before actually solving an AMAPF problem instance.

In this paper, we have revisited the reduction-based approach to optimally solving the Anonymous MAPF problem, when the latter is reduced to a search problem on an auxiliary graph of a special structure. We have suggested an improved AMAPF solver that is based on a specific search algorithm tailored to find paths on the auxiliary graphs exploiting their specific topology. We have shown that our improved AMAPF solver significantly outperforms the stateof-the-art one on a large variety of setups, leveraging its better scalability to the size of the input graph. Next, we plan to extend the proposed search technique to support costs (e.g. Min-Cost-Maximum-Flow) to solve other MAPF problems.

Acknowledgments

This work was partially supported by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730321P5Q0002; grant No. 70-2021-00138).

The Thirty-Eighth AAAI Conference on Artiﬁcial Intelligence (AAAI-24)

Adler, A.; De Berg, M.; Halperin, D.; and Solovey, K. 2015. Efficient multi-robot motion planning for unlabeled discs in simple polygons. In Algorithmic Foundations of Robotics XI: Selected Contributions of the Eleventh International Workshop on the Algorithmic Foundations of Robotics, 1 17.

Ahuja, R. K.; Magnanti, T. L.; and Orlin, J. B. 1995. Network flows: theory, algorithms and applications. Prentice Hall.

Bart ak, R.; Ivanov a, M.; and ˇSvancara, J. 2021. From classical to colored multi-agent path finding. In Proceedings of the 14th International Symposium on Combinatorial Search (So CS 2021), 150 152.

Chen, Z.; Alonso-Mora, J.; Bai, X.; Harabor, D. D.; and Stuckey, P. J. 2021. Integrated task assignment and path planning for capacitated multi-agent pickup and delivery. IEEE Robotics and Automation Letters, 6(3): 5816 5823.

Cruz-Mej ıa, O.; and Letchford, A. N. 2023. A survey on exact algorithms for the maximum flow and minimum-cost flow problems. Networks.

Ford, L. R.; and Fulkerson, D. R. 1956. Maximal flow through a network. Canadian journal of Mathematics, 8: 399 404.

Ford Jr, L. R.; and Fulkerson, D. R. 2015. Flows in networks, volume 54. Princeton university press.

Gonzalez, J. P.; Dornbush, A.; and Likhachev, M. 2012. Using state dominance for path planning in dynamic environments with moving obstacles. In 2012 IEEE International Conference on Robotics and Automation, 4009 4015. IEEE.

Gross, O. 1959. The bottleneck assignment problem. Rand.

H onig, W.; Kiesel, S.; Tinka, A.; Durham, J.; and Ayanian, N. 2018. Conflict-based search with optimal task assignment. In Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems.

Li, J.; Tinka, A.; Kiesel, S.; Durham, J. W.; Kumar, T. S.; and Koenig, S. 2021. Lifelong multi-agent path finding in large-scale warehouses. In Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI 2021), 11272 11281.

Liu, M.; Ma, H.; Li, J.; and Koenig, S. 2019. Task and path planning for multi-agent pickup and delivery. In Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS).

Ma, H.; and Koenig, S. 2016. Optimal Target Assignment and Path Finding for Teams of Agents. In Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2016), 1144 1152.

Ma, H.; Li, J.; Kumar, T.; and Koenig, S. 2017. Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks. In Proceedings of The 16th Conference on Autonomous Agents and Multi Agent Systems (AAMAS 2017), 837 845. International Foundation for Autonomous Agents and Multiagent Systems.

Nguyen, V.; Obermeier, P.; Son, T. C.; Schaub, T.; and Yeoh, W. 2017. Generalized target assignment and path finding using answer set programming. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI 2017), 1216 1223. Okumura, K.; and D efago, X. 2022. Solving Simultaneous Target Assignment and Path Planning Efficiently with Time-Independent Execution. In Proceedings of the International Conference on Automated Planning and Scheduling, volume 32, 270 278. Phillips, M.; and Likhachev, M. 2011. Sipp: Safe interval path planning for dynamic environments. In 2011 IEEE international conference on robotics and automation, 5628 5635. IEEE. Sharon, G.; Stern, R.; Felner, A.; and Sturtevant, N. R. 2015. Conflict-based search for optimal multi-agent pathfinding. Artificial Intelligence, 219: 40 66. Solovey, K.; and Halperin, D. 2016. On the hardness of unlabeled multi-robot motion planning. The International Journal of Robotics Research, 35(14): 1750 1759. Stern, R.; Sturtevant, N. R.; Felner, A.; Koenig, S.; Ma, H.; Walker, T. T.; Li, J.; Atzmon, D.; Cohen, L.; Kumar, T. K. S.; Boyarski, E.; and Bartak, R. 2019. Multi-Agent Pathfinding: Definitions, Variants, and Benchmarks. Symposium on Combinatorial Search (So CS), 151 158. Xu, Q.; Li, J.; Koenig, S.; and Ma, H. 2022. Multi Goal Multi-Agent Pickup and Delivery. In Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022). Yu, J.; and La Valle, S. M. 2013a. Multi-agent path planning and network flow. In Algorithmic foundations of robotics X, 157 173. Springer. Yu, J.; and La Valle, S. M. 2013b. Planning optimal paths for multiple robots on graphs. In 2013 IEEE International Conference on Robotics and Automation, 3612 3617. IEEE.

The Thirty-Eighth AAAI Conference on Artiﬁcial Intelligence (AAAI-24)