# Windowed MAPF with Completeness Guarantees

Rishi Veerapaneni, Muhammad Suhail Saleem, Jiaoyang Li, Maxim Likhachev
Carnegie Mellon University
{rveerapa, msaleem2, jiaoyanl, mlikhach}@cs.cmu.edu

## Abstract

Traditional multi-agent path finding (MAPF) methods try to compute entire collision-free start-goal paths, with several algorithms offering completeness guarantees. However, computing partial paths offers significant advantages, including faster planning, adaptability to changes, and enabling decentralized planning. Methods that compute partial paths employ a windowed approach and only try to find collision-free paths for a limited timestep horizon. While this improves flexibility, the adaptation introduces incompleteness; all existing windowed approaches can become stuck in deadlock or livelock. Our main contribution is to introduce our framework, WinC-MAPF, for windowed MAPF that enables completeness. Our framework leverages heuristic-update insights from single-agent real-time heuristic search algorithms and agent-independence ideas from MAPF algorithms. We also develop Single-Step Conflict-Based Search (SS-CBS), an instantiation of this framework using a novel modification to CBS. We show how SS-CBS, which plans only a single step and updates heuristics, can effectively solve tough scenarios where existing windowed approaches fail.

Extended version: https://arxiv.org/abs/2410.01798

## 1 Introduction

A core problem for multi-agent systems is to figure out how agents should move from their current locations to their goal locations. Without careful consideration, agents can collide, get stuck in deadlock, or take inefficient paths that take longer to traverse. This Multi-Agent Path Finding (MAPF) problem is particularly tough in congestion or when the number of agents becomes very large (e.g., hundreds). Full-horizon MAPF methods attempt to find entire paths for each agent from its start to its goal.
Many such MAPF methods additionally have theoretical completeness guarantees, i.e., they will find a solution if one exists, given enough time and compute. However, planning partial paths has multiple advantages, including decreased planning time, adaptability to changes, and enabling decentralized planning. This is particularly useful in scenarios where planning a full-horizon path may be tough (e.g., too many collisions to resolve) while planning a partial path is more feasible. Existing windowed methods therefore define a time window W and plan paths for each agent to its goal such that the first W timesteps account for inter-agent coordination and avoid collisions. This window W is typically much shorter than the entire solution path, e.g., W = 5 is common when the entire solution path spans 50 to 500 timesteps. As a result, windowed methods are significantly faster than those that compute the entire path. A key issue with windowed approaches is that their myopic planning results in deadlock or livelock. Table 1 shows examples where windowed MAPF solvers fail in congestion if their window is too small.

| Method | Horizon | Tunnel (3) | Tunnel (4) | Loopchain (6) | Loopchain (7) | Connector (5) | Connector (6) |
|---|---|---|---|---|---|---|---|
| wCBS | 1, 2, 4, 8, 16 | - | - | - | - | - | - |
| CBS+ | full | 0.9 | - | - | - | - | - |
| SS-CBS | 1 | 1 | 1 | 1 | 0.95 | 1 | 1 |

Table 1: Success rates of different optimal approaches: windowed CBS (wCBS) and full-horizon CBS with all optimizations enabled (CBS+). Horizon denotes window size. We test on small congested scenarios (number of agents in parentheses next to each map name) across 20 seeds with a 1-minute timeout. wCBS and CBS+ fail due to deadlock or timeout, while SS-CBS, with a single-step planning horizon, is able to solve these scenarios using our WinC-MAPF framework, which maintains completeness guarantees.

Copyright 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
More broadly, all existing windowed MAPF solvers, regardless of window size, lack theoretical completeness, and several windowed works have explicitly cited deadlock as a key issue in their experiments (Silver 2005; Li et al. 2020; Okumura et al. 2022; Jiang et al. 2024). Interestingly, no previous work has addressed completeness in windowed MAPF solvers.

Our first main contribution is the introduction of the Windowed Complete MAPF framework, WinC-MAPF, designed to create windowed MAPF solvers that guarantee completeness. WinC-MAPF is a general framework that leverages concepts from single-agent real-time heuristic search and the semi-independent structure of MAPF. First, we view windowed MAPF in its joint configuration space and show how we can apply real-time heuristic updates on the joint configuration to enable completeness. However, due to the large joint configuration space in a MAPF problem, naively applying heuristic updates is impractical. Thus, second, we leverage the semi-independent structure of MAPF problems to focus heuristic updates on disjoint agent groups simultaneously, resulting in efficient performance.

An important module in the WinC-MAPF framework is an Action Generator (AG) that computes the next set of actions the agents need to execute. To guarantee completeness, the AG must (1) find the optimal windowed action that incorporates heuristic updates, and (2) identify disjoint agent groups. Our framework and proof encourage future research on the development of windowed AGs that satisfy these properties.

As a starting point, our second main contribution is Single-Step Conflict-Based Search (SS-CBS), a CBS-based AG that follows the WinC-MAPF framework and plans only for a single timestep (W = 1). Conflict-Based Search (Sharon et al.
2015) can be easily modified to be windowed by only considering conflicts within the window, but we show how naively integrating heuristic updates can fail. Thus, SS-CBS introduces a novel heuristic conflict and constraint to address this issue. We empirically demonstrate how SS-CBS, with single-step planning, outperforms windowed CBS with larger windows across both small and large instances.

## 2 Related Work

### 2.1 Problem Formulation

Multi-Agent Path Finding (MAPF) is the problem of finding collision-free paths for a group of N agents $i = 1, \ldots, N$ that take each agent from its start location $s_i^{start}$ to its goal location $s_i^{goal}$. In traditional 2D MAPF, the environment is discretized into grid cells, and time is broken down into discrete timesteps. Agents are allowed to move in any cardinal direction or wait in the same cell. A valid solution is a set of agent paths $\Pi = \{\pi_1, \ldots, \pi_N\}$ where $\pi_i[0] = s_i^{start}$ and $\pi_i[T_i] = s_i^{goal}$, with $T_i$ the maximum timestep of the path for agent i. Critically, agents must avoid vertex collisions ($\pi_i[t] = \pi_j[t]$ for some $j \neq i$) and edge collisions ($\pi_i[t] = \pi_j[t+1] \wedge \pi_i[t+1] = \pi_j[t]$) for all timesteps t. The typical objective in optimal MAPF is to find a solution $\Pi$ that minimizes $|\Pi| = \sum_{i=1}^{N} |\pi_i| = \sum_{i=1}^{N} \sum_{t=0}^{T_i-1} c(s_i^t, s_i^{t+1})$. This work solves standard MAPF, which has $c(s_i^t, s_i^{t+1}) = 1$ unless the agent is resting at its goal (where $c(s_i^{goal}, s_i^{goal}) = 0$).

**Windowed MAPF.** Instead of resolving all collisions at once, windowed planners iteratively plan a smaller collision-free path and execute it, in essence breaking the problem into smaller, more feasible chunks. For example, a recent MAPF competition, the League of Robot Runners (Chan et al. 2024), required planning for hundreds of agents within 1 second and thus utilized windowed planning interleaved with execution. Windowed MAPF methods plan partial paths that only reason about collisions for timesteps $t \leq W$, where W is a hyper-parameter window size.
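As a concrete aid for readers implementing the formulation, the vertex and edge collision conditions above can be checked directly on a set of paths. Below is a minimal sketch; the function name and path representation are our own illustration, not from the paper.

```python
def first_collision(paths):
    """Return the first vertex or edge collision among agent paths, or None.

    paths: one path per agent, each a list of (row, col) cells per timestep.
    Shorter paths are padded with their last cell (agents rest at goals).
    """
    T = max(len(p) for p in paths)
    padded = [p + [p[-1]] * (T - len(p)) for p in paths]
    for t in range(T):
        # Vertex collision: pi_i[t] == pi_j[t] for some i != j.
        seen = {}
        for i, p in enumerate(padded):
            if p[t] in seen:
                return ("vertex", seen[p[t]], i, t)
            seen[p[t]] = i
        # Edge collision: agents i and j swap cells between t and t+1.
        if t + 1 < T:
            for i in range(len(padded)):
                for j in range(i + 1, len(padded)):
                    if padded[i][t] == padded[j][t + 1] and padded[i][t + 1] == padded[j][t]:
                        return ("edge", i, j, t)
    return None
```

For example, two agents swapping cells, `[[(0, 0), (0, 1)], [(0, 1), (0, 0)]]`, trigger an edge collision at timestep 0.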
Thus, after the first W timesteps, the remaining path $\pi_i[W] \ldots \pi_i[T_i]$ is the agent's optimal path to the goal (in the absence of other agents), as it does not need to avoid collisions with other agents. Mathematically then, the cost of $\pi_i$ is $\sum_{t=0}^{W-1} c(s_i^t, s_i^{t+1}) + c^*(s_i^W, s_i^{goal})$, where $c^*(s_i^W, s_i^{goal})$ is the optimal cost to the goal. We note that all performant 2D MAPF methods compute a backward Dijkstra search for each agent, yielding $h_i^*(s) = c^*(s, s_i^{goal})$. Thus, instead of fully planning $\pi_i^{0:T_i}$, we can equivalently plan just the windowed horizon $\pi_i^{0:W}$ and minimize $|\Pi| = \sum_{i=1}^{N} \left( \sum_{t=0}^{W-1} c(s_i^t, s_i^{t+1}) + h_i^*(s_i^W) \right)$.

### 2.2 MAPF Methods

There exist many different types of heuristic search solvers for MAPF. One early approach is Prioritized Planning (Erdmann and Lozano-Perez 1987), which assigns priorities to agents and plans them sequentially, with later agents avoiding earlier agents. PIBT (Okumura et al. 2022) is a recent popular method that allows agents to inherit other agents' priorities. Conflict-Based Search (Sharon et al. 2015) is another popular method that decouples the planning problem into two stages: a high-level search resolves collisions between agents by applying constraints, while a low-level search finds paths for individual agents that satisfy those constraints. There are many extensions to CBS that improve the searches as well as the applied constraints (Barer et al. 2014; Li, Ruml, and Koenig 2021; Li et al. 2021b).

When faced with shorter planning times, methods typically simplify the planning problem to just finding partial collision-free paths. Windowed Hierarchical Cooperative A* (Silver 2005) is a windowed variant of Hierarchical Cooperative A*, which is essentially a prioritized planner using a backward Dijkstra heuristic. Silver notes that this is not complete due to the use of priorities. Rolling Horizon Conflict Resolution (RHCR) applies a rolling horizon for lifelong MAPF planning and replans paths at repeated intervals (Li et al.
2020). RHCR faces deadlock and attempts to combat it by increasing the planning window, but its authors still note that the method is incomplete. Bounded Multi-Agent A* (Sigurdson et al. 2018) proposes that each agent run its own limited-horizon real-time planner, treating other agents as dynamic obstacles; however, the authors acknowledge that deadlock occurs when agents need to coordinate with one another. Planning and Improving while Executing (Zhang et al. 2024) is a recent work that attempts to quickly generate an initial full plan using LaCAM (Okumura 2022) and then refines it during execution using MAPF-LNS2 (Li et al. 2022). However, if a complete plan cannot be found, the method resorts to using the best partial path available, making it incomplete in such situations. The winning submission (Jiang et al. 2024) to the League of Robot Runners competition, due to the tight planning time constraint, leveraged windowed planning with PIBT and LNS (Li et al. 2021a). They explicitly note that deadlock in congestion is a significant challenge. To the best of our knowledge, there does not exist any windowed MAPF solver with completeness guarantees.

### 2.3 Real-Time Single-Agent Search

We leverage ideas from Real-Time Search, a single-agent heuristic search setting where, due to tight time constraints, the agent is required to iteratively plan and execute partial paths. Despite repeatedly planning partial paths, Real-Time Search methods are designed to maintain completeness. The main innovation in the single-agent real-time search literature is that the agent updates (increases) the heuristic values of encountered states. This prevents deadlock/livelock, as states that are repeatedly visited receive larger and larger heuristic values, which encourages the search to explore other areas. A large variety of real-time algorithms, such as LRTA* (Korf 1990), RTAA* (Koenig and Likhachev 2006), and LSS-LRTA* (Koenig and Sun 2009), propose to update the heuristic in different ways.
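To make the heuristic-update idea concrete, here is a minimal one-step LRTA*-style loop on a tiny hand-made graph; the graph, names, and tie-breaking are illustrative, not from the paper. The agent is first lured into a dead end by an optimistic heuristic, raises the heuristic values it encounters, and then escapes to the goal.

```python
def lrta_star(neighbors, cost, h, start, goal, max_steps=100):
    """One-step LRTA*-style search: repeatedly move to the neighbor s'
    minimizing c(s, s') + h(s'), updating h(s) with the Bellman-style
    rule h(s) <- max(h(s), c(s, s') + h(s')). Returns the visited states."""
    s = start
    steps = []
    for _ in range(max_steps):
        if s == goal:
            return steps
        best = min(neighbors[s], key=lambda n: cost[(s, n)] + h[n])
        h[s] = max(h[s], cost[(s, best)] + h[best])  # heuristic update
        steps.append(best)
        s = best
    return None  # step budget exhausted

# Illustrative graph: from S the agent can reach a dead end D or the goal G.
neighbors = {"S": ["D", "G"], "D": ["S"], "G": []}
cost = {("S", "D"): 1, ("S", "G"): 1, ("D", "S"): 1}
h = {"S": 1, "D": 0, "G": 0}  # h(D) is misleadingly optimistic
```

With these values the agent first enters the dead end D (tie broken toward D), backs out after raising h(D) to 2, and then reaches G, visiting D, S, G in order.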
Section A.1 contains a formal proof of how completeness guarantees hold in Real-Time Search. The core idea is that given a current state s, a cost function c, and a partial path leading to a new state $s^W$, we update the heuristic value via a standard Bellman update equation $h(s) \leftarrow U(s, s^W) := \max(h(s), c(s, s^W) + h(s^W))$. Now, if an agent does not reach the goal, it must be stuck in deadlock/livelock, repeatedly visiting some states $s \in S_{stuck}$. Under certain (achievable) conditions, updating the heuristic can be shown to repeatedly increase $h(s)$ for $s \in S_{stuck}$ (Korf 1990). At some point then, if the agent is optimally planning to the state $s^W$ that minimizes $c(s, s^W) + h(s^W)$, it will eventually pick an $s^W \notin S_{stuck}$. Thus, most single-agent Real-Time Search algorithms require an optimal planner to maintain completeness. We note that in the above proof sketch, the planning window W does not matter, nor does how many steps of the partial plan the agent chooses to execute (e.g., committing to only one step or executing all W steps). Completeness holds for any $W \geq 1$ as long as the agent moves at least one step.

## 3 Windowed-MAPF with Guarantees

This section describes our Windowed Complete MAPF framework, WinC-MAPF, for creating windowed MAPF solvers that guarantee completeness. We leverage two key insights. Our first insight is that we can apply single-agent real-time update ideas to MAPF planning if we interpret the MAPF problem as a single-agent problem in the combined joint state space. This allows us to update the heuristics of previously seen states, enabling completeness. However, doing just this is ineffective due to the large state space. To this end, our second insight is that we can leverage MAPF's agent semi-independence to intelligently update the heuristic values of multiple states at once, allowing the search to fill in heuristic depressions quickly and exit local minima faster.
### 3.1 Planning in Joint Configuration Space

We can leverage the single-agent Real-Time Search literature (Section 2.3) if we view our multi-agent problem in the joint configuration space of all agents. Here is a quick conceptual but mathematically imprecise summary: we view the N agents in their joint configuration C. At every timestep, we query the action generator to return a valid sequence of actions that leads to a configuration $C'$ which minimizes $c(C, C') + h(C')$. We move the agents along the path and update the heuristic of C via $h(C) \leftarrow c(C, C') + h(C')$. We can then prove completeness by directly applying the single-agent real-time search literature.

We now formally define the above. Given N agents, we define the configuration $C^t = [s_1^t, \ldots, s_N^t]$. At every planning iteration, we query a high-level action generator to return a sequence of configurations $\Pi^{0:W} = [C^0, \ldots, C^W]$ which minimizes $|\Pi| = |\Pi^{0:W}| + |\Pi^{W:T_{max}}| = \sum_{t=0}^{W-1} c(C^t, C^{t+1}) + h_{BD}(C^W)$. We define the joint cost and heuristic intuitively: $c(C, C') = \sum_{i=1}^{N} c(s_i, s_i')$ and $h_{BD}(C) = \sum_{i=1}^{N} h_i^*(s_i)$ (BD = Backward Dijkstra).

**Heuristic Penalties.** Now, given the transition (partial path) $C \to C^W$ from the AG, we update

$$h(C) \leftarrow U(C, C^W) := \max(h(C), c(C, C^W) + h(C^W)) \quad (1)$$

$h(C)$ is initially set to $h_{BD}(C)$ and gradually increases as agents visit configurations. We use the term heuristic penalty (HP) configuration for visited configurations whose heuristic increased via the update equation, to denote how the updated heuristic penalizes those configurations and encourages the search to explore others. An HP configuration C has a nonzero increase over the base heuristic value, i.e., $h(C) = h_{BD}(C) + hp(C)$ where the penalty $hp(C) > 0$.

**Action Generator.** Given $h(C)$, which includes states with and without HPs, we want an optimal action generator AG that finds $\arg\min_{C^W} c(C, C^W) + h(C^W)$. If so, we get the same completeness guarantees by directly reusing the single-agent proof.
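The joint heuristic and the update in Equation (1) can be sketched as follows, with penalties stored per configuration. The data structures are our own illustration; `h_star` holds each agent's precomputed backward-Dijkstra values.

```python
def h_bd(config, h_star):
    """Joint backward-Dijkstra heuristic: h_BD(C) = sum_i h_i*(s_i)."""
    return sum(h_star[i][s] for i, s in enumerate(config))

def apply_update(config, config_w, joint_cost, h_star, penalties):
    """Equation (1): h(C) <- max(h(C), c(C, C^W) + h(C^W)).

    h(C) is represented implicitly as h_BD(C) + penalties.get(C, 0), so
    only the nonzero penalty hp(C) = h(C) - h_BD(C) is stored."""
    h_c = h_bd(config, h_star) + penalties.get(config, 0)
    h_cw = h_bd(config_w, h_star) + penalties.get(config_w, 0)
    new_h = max(h_c, joint_cost + h_cw)
    hp = new_h - h_bd(config, h_star)
    if hp > 0:
        penalties[config] = hp  # configuration becomes an HP configuration
    return new_h
```

For example, with two agents on a 4-cell corridor (agent 0 heading to cell 3, agent 1 to cell 0), if both wait in place for one step off their goals (joint cost 2), the visited configuration's heuristic rises from 4 to 6 and a penalty of 2 is recorded.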
We again note that completeness holds for any $W \geq 1$ and regardless of how many steps along the partial path the agents execute before replanning.

### 3.2 Disjoint Agent Groups

The framework we have described so far suffers from an obvious issue: the number of configurations C grows exponentially with the number of agents. Consequently, escaping local minima, which in our context are agents stuck in deadlock/livelock, can require updating the heuristic of an impractical number of states. Our key idea is therefore to compute and apply heuristic penalties to specific groups of agents instead of to the entire configuration. This significantly speeds up performance, as it allows the search to focus on specific stuck agents instead of on all agents. Thus, given a configuration transition $C \to C^W$, we decompose the agents into disjoint agent groups such that agents in different groups are not interacting with each other. For example, in Figure 1, we would like to determine that the blue and green agents are blocking each other but that the orange agents are not.

Definition 1 (Disjoint Agent Groups). Given a configuration transition $C \to C^W$ and a set of disjoint agent groups $\{Gr_i\}$, we have the property that for each agent $R_j$ with transition $s_j \to s_j^W$ in disjoint agent group $Gr_i$, there cannot exist another agent in a different group $Gr_k$ that blocked $R_j$ from picking a better path. Conceptually, this means that each group's decision to make $C_{Gr_i} \to C^W_{Gr_i}$ is independent of agents in other groups.

Instead of computing disjoint groups where agents are not interacting, it is easier to determine coupled agent groups where they could be interacting.

Definition 2 (Coupled Agents). Given a configuration transition $C \to C^W$, an agent $R_i$ is coupled with $R_j$ if $R_j$ prevents $R_i$ from choosing a better path, or vice versa.

Note that coupled agents must be in the same disjoint agent group Gr.
Figure 1: The iterative planning, execution, and grouped heuristic updates/penalties in WinC-MAPF.

Importantly, disjoint agent groups do not need to consist solely of coupled agents, i.e., it is acceptable for extra non-coupled agents (e.g., agents independent of all others) to be in disjoint groups. From an abstract perspective, we can compute the groups by iterating through each agent $R_i$ which is not on its optimal single-agent path and storing the ids of the other agents which prevented it from picking a better path (there must exist at least one other agent $R_j$, otherwise $R_i$ could have taken its optimal path). This builds a dependency graph where an edge $R_i \to R_j$ denotes that $R_i$ is blocked by $R_j$. We can then find all the disjoint connected components in this dependency graph (e.g., via a DFS), where each connected component depicts a group of agents $Gr_i$ that are blocking each other and are independent from other groups. Instead of updating $h(C) \leftarrow U(C, C^W)$, we apply $h(C_{Gr_i}) \leftarrow U(C_{Gr_i}, C^W_{Gr_i})$ for each group of agents $Gr_i$ at configuration $C_{Gr_i}$. Our objective is, given $C, C^W$, to detect these groups of agents and apply heuristic penalties to just the group of agents rather than the entire configuration.

One crucial observation is that it can be non-trivial to determine which agents are coupled once the AG returns the next state. Agents could be next to each other but not block each other, or, on the flip side, be non-adjacent but coupled. However, instead of reasoning about coupled agents after the AG, we can leverage the AG itself, as it must have internally reasoned about agent interactions to return a valid next action.
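The grouping procedure just described (build a dependency graph of blocked-by edges and take its connected components) can be sketched as follows; the input format is our own illustration.

```python
def disjoint_agent_groups(num_agents, blocked_by):
    """Find disjoint agent groups as connected components of the (undirected)
    dependency graph. blocked_by maps agent i to the set of agents that
    prevented i from picking a better path."""
    adj = {i: set() for i in range(num_agents)}
    for i, blockers in blocked_by.items():
        for j in blockers:
            adj[i].add(j)
            adj[j].add(i)
    groups, seen = [], set()
    for i in range(num_agents):
        if i in seen:
            continue
        stack, comp = [i], []
        seen.add(i)
        while stack:  # iterative DFS over one component
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        groups.append(sorted(comp))
    return groups
```

For example, if agent 0 is blocked by agent 1 and agent 3 by agent 4 (agent 2 independent), this yields the groups {0, 1}, {2}, {3, 4}, and penalties are then applied per group via $h(C_{Gr_i}) \leftarrow U(C_{Gr_i}, C^W_{Gr_i})$.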
All modern MAPF heuristic search planners reason about agent interactions internally; e.g., M* explicitly couples agents that intersect (Wagner and Choset 2011), PIBT's priority inheritance reasons about colliding agents, and CBS resolves collisions between intersecting agents. Thus, we require that the AG additionally return groups of interacting agents. We highlight that this can generally be done with bookkeeping and without much added compute in existing MAPF solvers. Given a configuration C and a set of HPs for groups of agents at various $C_{Gr}$, we then compute $h(C) = h_{BD}(C) + \sum_i hp(C_{Gr_i})$ for a disjoint set of groups $\{C_{Gr_i}\}$ whose locations match the configuration. We lastly note that Disjoint Agent Groups is related to Independence Detection in Operator Decomposition (Standley 2010), which dynamically constructs independent subproblems.

```
Algorithm 1: Windowed Complete MAPF Framework
 1: procedure WindowedCompleteMAPF(C_cur)
 2:   H ← HeuristicPenalties
 3:   while C_cur ≠ Goal do
 4:     C^W, ListOfGroups ← AG(C_cur, H)
 5:     for Gr ∈ ListOfGroups do        ▷ For each group
 6:       h_n ← U(C_cur_Gr, C^W_Gr)     ▷ Equation (1)
 7:       penalty ← h_n − h_BD(C_cur_Gr)
 8:       if penalty > 0 then
 9:         H.insert(C_cur_Gr, penalty)
10:     C_cur ← C^1                     ▷ Execute (at least) one step of the partial path
```

### A.1 Real-Time Search Completeness Proof

Suppose the agent is stuck in a cycle of states $s_1 \to s_2 \to \ldots \to s_N \to s_1$ without reaching the goal. If no heuristic value in the cycle were ever increased, then for every step we would have $h(s_i) \geq c(s_i, s_{i+1}) + h(s_{i+1}) > h(s_{i+1})$ as $c(s_i, s_{i+1}) > 0$. This occurs over all $s_i$, which leads to $h(s_1) > h(s_2) > \ldots > h(s_N) > h(s_1)$, which is a contradiction. Therefore, there must be some state that gets its heuristic value increased, i.e., some $s_i$ s.t. $h'(s_i) \leftarrow c(s_i, s_{i+1}) + h(s_{i+1}) > h(s_i)$. Thus, LRTA* is guaranteed to eventually leave any finite cycle. Since a finite bidirectional graph has a finite set of cycles without the goal, LRTA* will at worst explore all of them, eventually exit them, and reach the goal.

**Subtleties.** This proof of completeness assumes that $h'(s_i) \leftarrow c(s_i, s_{i+1}) + h(s_{i+1}) \geq h(s_i)$. This holds when h is consistent but may not hold when h is merely admissible. Thus, for arbitrary h, we should update via $h'(s_i) \leftarrow \max(h(s_i), c(s_i, s_{i+1}) + h(s_{i+1}))$ (Eq. 1) and then reuse the same completeness proof.
This proof also assumes that some $s_i$ in the cycle has a different state that can be picked, i.e., it has more than one neighbor. If there exists a path from the cycle to the goal, then there must be some $s_i$ where this holds, and we apply this logic for that $s_i$. Lastly, this proof was shown for one-step planning but can be directly applied to planning for W steps. It is also independent of how many steps along the W-length partial path the agent chooses to move (as long as it moves at least one step along the path).

### A.2 WinC-MAPF Proof

At a high level, the LRTA* proof proves completeness by showing that, in a cycle and given the update $h'(s_i) \leftarrow c(s_i, s_{i+1}) + h(s_{i+1})$, the heuristic values in the cycle must grow so large that LRTA* will pick a different $s_i'$ that avoids the large heuristic values. This requires/assumes that only heuristic values in the cycle increase while other heuristic values do not. A key idea of WinC-MAPF is to compute heuristic penalties on the configurations of disjoint agent groups rather than the entire configuration. Note that without this, i.e., if we just compute heuristic penalties on the full configuration, we get completeness by directly applying the single-agent proof. Using group HPs thus complicates the proof of completeness, as when computing/creating a heuristic penalty, we increase the heuristic values of configurations not in the cycle. Instead, we prove completeness in Theorem 3 by showing that our overall heuristic values are always admissible. If we can show this, then we are guaranteed not to cycle infinitely, as doing so would infinitely increase some heuristic value and contradict our admissible heuristic value guarantee.

Theorem 3. Given a finite bidirectional graph and: (1) an initial Backward Dijkstra heuristic, (2) an AG that picks $\arg\min_{C^W} c(C, C^W) + h(C^W)$ and identifies disjoint agent groups, WinC-MAPF with its update equation (Eq.
1) applied on group configurations will always have admissible heuristic values for all C.

We calculate the joint heuristic value by summing up mutually exclusive group HPs, as described in Section B.1. If we can prove that each group's HP values are admissible, then the sum of disjoint grouped heuristic values is also admissible, as interactions between groups can only increase the true cost-to-go. Thus, we prove the following lemma.

Lemma 1. Given a finite bidirectional graph and: (1) an initial Backward Dijkstra heuristic, (2) an AG that picks $\arg\min_{C^W} c(C, C^W) + h(C^W)$ and identifies disjoint agent groups, WinC-MAPF with its update equation (Eq. 1) applied on group configurations will always have admissible heuristic values for all group configurations $C_{Gr}$.

Proof. We prove this via induction. Our inductive hypothesis is that $h(C_{Gr}) \leq h^*(C_{Gr})$ across all group configurations $C_{Gr}$ and all iterations of running our windowed MAPF framework.

Base case: Our initial group heuristic values, the sum of each agent's backward Dijkstra distance, are admissible, as interactions between agents can only increase the solution cost. Thus, over all groups of agents, the joint heuristic value is admissible.

Inductive step: We assume that at some timestep T at C, all the heuristic values are admissible. The AG picks a $C^W$ that minimizes $c(C, C^W) + h(C^W)$. Then, for each group, we update its configuration's value (via our heuristic penalty) to $h(C_{Gr}) \leftarrow U(C_{Gr}, C^W_{Gr})$. Suppose that there is a group $Gr_A$ at $C_{Gr_A}$ whose heuristic value became inadmissible after applying U. This can only occur when the chosen $C^W_{Gr_A}$ was not optimal, so there must exist a different $C'_{Gr_A}$ for which $U(C_{Gr_A}, C'_{Gr_A})$ would be admissible. However, since we are running an optimal AG, we would have replaced $C^W_{Gr_A}$ with $C'_{Gr_A}$: if $C'_{Gr_A}$ did not interact with any other agents, it would reduce our overall cost without changing any other agents' locations/actions, contradicting the optimality of the AG.
If it did interact with other agents, then $Gr_A$ would need to be larger and include these other agents, violating our definition of disjoint agent groups. Thus, neither case is possible. This means that our heuristic values stay admissible for all disjoint agent groups after updating.

## B SS-CBS Subtleties

### B.1 Determining Which HPs to Apply

As discussed earlier, given a CT node, we need to detect heuristic conflicts. This can be done by going through all the HPs and checking if their agent locations match the CT node's configuration. However, a non-trivial issue occurs: multiple non-disjoint heuristic conflicts could be detected. For example, a configuration could have one heuristic penalty involving (R1, R2) and another involving (R2, R3). We cannot apply both, as this would double-count the heuristic penalty involving R2. More broadly, we cannot apply multiple non-disjoint heuristic conflicts, as resolving all of them would over-count the heuristic penalty.

Given a set of possible HPs, we could try to maximize the combination of these penalties such that the chosen penalties do not have overlapping agents. This ends up being a maximum-weight disjoint set-cover problem, where each heuristic penalty is a set and the set's weight is the penalty. We note that solving just disjoint set-cover is NP-Complete (Cardei and Du 2005), so solving this efficiently is hard. We therefore instead use a greedy approximation and choose to apply the heuristic conflict with the highest penalty.

One important implementation note: initially, we greedily chose the highest HP that was mutually exclusive (i.e., did not share agents) with previously chosen HPs. This is wrong and resulted in deadlock, as the order in which HPs are chosen affects the solution. In certain deadlock locations, say (R1, R2, R3), SS-CBS would always first encounter an HP with (R1, R2) and apply that.
Then, later in the same CT search, (R1, R2, R3) was encountered, but we did not apply an HP since R1 and R2 were already used up. This resulted in SS-CBS repeatedly picking the deadlock location: even though (R1, R2, R3) accumulated large penalties, SS-CBS would only apply the (R1, R2) penalty. Thus, every time we compute heuristic penalties and conflicts, we recompute them from scratch while not violating constraints. This allows us to initially pick (R1, R2) and then afterwards reassign (R1, R2, R3).

### B.2 Tie-Breaking

Given the 1-step plans, it is likely that there are many SS-CBS solutions with an equally good cost. We found it helpful to tie-break by introducing agent "priorities". The intuition is that, given a symmetric situation of two agents in a hallway (where all adjacent configurations have the same summed h-value), we prefer one agent to push the other agent away rather than have both oscillate in the middle. Concretely, if two CT nodes have the same f-value, h-value, g-value (cost), and number of conflicts, we tie-break nodes by lexicographically comparing the agents' h-values. This means that, given equally good options, we prefer a solution that reduces the first lexicographically sorted agent's heuristic more than other options. The agents' lexicographical ordering can be thought of as their tie-breaking priority (i.e., the agent that comes first has the highest priority). We additionally found that assigning random priorities, which are updated as in PIBT, helped performance (see the extended version).

## Acknowledgments

The research was supported by the National Science Foundation under grant #2328671, by the National Science Foundation Graduate Research Fellowship Program under grant #DGE2140739, and a gift from Amazon. The views and conclusions in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the sponsoring organizations, agencies, or the U.S. government.
## References

Barer, M.; Sharon, G.; Stern, R.; and Felner, A. 2014. Suboptimal Variants of the Conflict-Based Search Algorithm for the Multi-Agent Pathfinding Problem. In Seventh Annual Symposium on Combinatorial Search.

Cardei, M.; and Du, D. 2005. Improving Wireless Sensor Network Lifetime through Power Aware Organization. Wireless Networks, 11(3): 333–340.

Chan, S.-H.; Chen, Z.; Guo, T.; Zhang, H.; Zhang, Y.; Harabor, D.; Koenig, S.; Wu, C.; and Yu, J. 2024. The League of Robot Runners Competition: Goals, Designs, and Implementation. In ICAPS 2024 System's Demonstration Track.

Erdmann, M.; and Lozano-Perez, T. 1987. On Multiple Moving Objects. Algorithmica, 2(1): 477–521.

Jiang, H.; Zhang, Y.; Veerapaneni, R.; and Li, J. 2024. Scaling Lifelong Multi-Agent Path Finding to More Realistic Settings: Research Challenges and Opportunities. In Proceedings of the International Symposium on Combinatorial Search, volume 17, 234–242.

Koenig, S.; and Likhachev, M. 2006. Real-Time Adaptive A*. In Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2006), Hakodate, Japan, 281–288. ACM.

Koenig, S.; and Sun, X. 2009. Comparing Real-Time and Incremental Heuristic Search for Real-Time Situated Agents. Autonomous Agents and Multi-Agent Systems, 18: 313–341.

Korf, R. E. 1990. Real-Time Heuristic Search. Artificial Intelligence, 42(2): 189–211.

Li, J.; Chen, Z.; Harabor, D.; Stuckey, P. J.; and Koenig, S. 2021a. Anytime Multi-Agent Path Finding via Large Neighborhood Search. In Proceedings of the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS '21), 1581–1583. ACM.

Li, J.; Chen, Z.; Harabor, D.; Stuckey, P. J.; and Koenig, S. 2022. MAPF-LNS2: Fast Repairing for Multi-Agent Path Finding via Large Neighborhood Search.
Proceedings of the AAAI Conference on Artificial Intelligence, 36(9): 10256–10265.

Li, J.; Harabor, D.; Stuckey, P. J.; and Koenig, S. 2021b. Pairwise Symmetry Reasoning for Multi-Agent Path Finding Search. CoRR, abs/2103.07116.

Li, J.; Ruml, W.; and Koenig, S. 2021. EECBS: A Bounded-Suboptimal Search for Multi-Agent Path Finding. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 12353–12362.

Li, J.; Tinka, A.; Kiesel, S.; Durham, J. W.; Kumar, T. K. S.; and Koenig, S. 2020. Lifelong Multi-Agent Path Finding in Large-Scale Warehouses. In Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS '20), 1898–1900. International Foundation for Autonomous Agents and Multiagent Systems.

Okumura, K. 2022. LaCAM: Search-Based Algorithm for Quick Multi-Agent Pathfinding. arXiv:2211.13432.

Okumura, K.; Machida, M.; Défago, X.; and Tamura, Y. 2022. Priority Inheritance with Backtracking for Iterative Multi-Agent Path Finding. Artificial Intelligence, 310: 103752.

Sharon, G.; Stern, R.; Felner, A.; and Sturtevant, N. R. 2015. Conflict-Based Search for Optimal Multi-Agent Pathfinding. Artificial Intelligence, 219: 40–66.

Sigurdson, D.; Bulitko, V.; Yeoh, W.; Hernández, C.; and Koenig, S. 2018. Multi-Agent Pathfinding with Real-Time Heuristic Search. In 2018 IEEE Conference on Computational Intelligence and Games (CIG), 1–8.

Silver, D. 2005. Cooperative Pathfinding. In Proceedings of the First AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE '05), 117–122. AAAI Press.

Standley, T. S. 2010. Finding Optimal Solutions to Cooperative Pathfinding Problems. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2010), 173–178. AAAI Press.

Stern, R.; Sturtevant, N. R.; Felner, A.; Koenig, S.; Ma, H.; Walker, T.
T.; Li, J.; Atzmon, D.; Cohen, L.; Kumar, T. K. S.; Boyarski, E.; and Barták, R. 2019. Multi-Agent Pathfinding: Definitions, Variants, and Benchmarks. In Symposium on Combinatorial Search (SoCS), 151–158.

Veerapaneni, R.; Kusnur, T.; and Likhachev, M. 2023. Effective Integration of Weighted Cost-to-Go and Conflict Heuristic within Suboptimal CBS. In Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023), 11691–11698. AAAI Press.

Veerapaneni, R.; Wang, Q.; Ren, K.; Jakobsson, A.; Li, J.; and Likhachev, M. 2024. Improving Learnt Local MAPF Policies with Heuristic Search. International Conference on Automated Planning and Scheduling, 34(1): 597–606.

Wagner, G.; and Choset, H. 2011. M*: A Complete Multirobot Path Planning Algorithm with Performance Bounds. In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, 3260–3267.

Zhang, Y.; Chen, Z.; Harabor, D.; Le Bodic, P.; and Stuckey, P. J. 2024. Planning and Execution in Multi-Agent Path Finding: Models and Algorithms. Proceedings of the International Conference on Automated Planning and Scheduling, 34(1): 707–715.