This repo maintains the code for the paper, which won the ICRA 2025 Best Paper Award on Multi-Robot Systems and the Best Student Paper Award.
There are some other amazing repos included and maintained in the `lmapf_lib` folder:
- Guided-PIBT
- learn-to-follow
- MAPFCompetition2023: the winning solution of the League of Robot Runners Competition 2023. The 2024 competition has a stronger winner, EPIBT; take a look at it if you are interested in search-based approaches.
- RHCR
Examples for training and evaluation are provided as the scripts below. All training and evaluation heavily rely on the experiment configs in `expr_configs`.
This repo is currently quite messy for those who need to modify the internal code: it involves both complex search- and learning-based methods, builds upon a distributed framework for large-scale training, and contains much more than what is presented in the paper. On the other hand, the ideas conveyed by the paper are straightforward and easy to implement in other works.
Please contact me (reverse: moc.liamxof@rivers) if you have any questions.
Due to the complexity of the project, we maintain the Backward Dijkstra, Static Guidance, and Dynamic Guidance versions in separate branches (`static_guidance`, `dynamic_guidance`). This branch is for the Static Guidance version. (The Backward Dijkstra version is almost the same as the Static Guidance version: we only need to set `map_weights_path` to `""` in all the `expr_configs`, as sketched below.)
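For illustration only, here is a minimal sketch (not part of the repo) that clears every `map_weights_path` in the configs under `expr_configs`. Note that round-tripping through PyYAML drops comments and reorders keys, so editing the files by hand may be preferable.

```python
import pathlib
import yaml

def clear_map_weights(node):
    # Recursively set every map_weights_path key to "" regardless of nesting.
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "map_weights_path":
                node[key] = ""
            else:
                clear_map_weights(value)
    elif isinstance(node, list):
        for item in node:
            clear_map_weights(item)

for path in pathlib.Path("expr_configs").rglob("*.yaml"):
    cfg = yaml.safe_load(path.read_text())
    clear_map_weights(cfg)
    path.write_text(yaml.safe_dump(cfg))
```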
The implementation of the Dynamic Guidance version is more complex than that of the Static Guidance version, so try to run the code in the `static_guidance` branch first (`git checkout static_guidance`).
- For reproduction of the Static Guidance results on the main benchmark of this paper, please refer to line 4 of Table IV in the appendix (using `eval.sh`).
- For reproduction of the Backward Dijkstra results on the main benchmark of this paper, please refer to line 2 of Table IV in the appendix (using `eval.sh` with `MODEL_FOLDER` or `MAP_WEIGHTS_PATH` set).
- For reproduction of the Backward Dijkstra results on the learn-to-follow benchmark, please refer to Figure 9 in the appendix (using `eval_ltf.sh`).
To compile, run `./compile.sh`.
- Configuration files are defined in the folder `expr_configs/paper_exps_v3`.
- The map reader and generator are defined in the file `light_malib/envs/LMAPF/map.py`. The benchmark data in this paper is in the `lmapf_lib/data/papere_exp_v3` folder. It uses the same data format as the League of Robot Runners Competition 2023.
- The environment is defined in the file `light_malib/envs/LMAPF/env.py`.
- The training logic is defined in the file `light_malib/framework/ppo_runner.py`.
- The rollout function (simulation) is defined in the file `light_malib/rollout/rollout_func_LMAPF.py`.
- Neural network models are defined in the folder `light_malib/model/LMAPF`.
- Pretrained weights are in the folder `pretrained_models`.
- The training logs are by default in the folder `logs`. Tensorboard can be used to monitor the training (e.g., `tensorboard --logdir logs`). The subfolder `agent_0` will contain the weight checkpoints.
- There are several important C++ wrappers for `PIBT` and `Parallel LNS`, defined in the files `lmapf_lib/MAPFCompetition2023/tools/py_PIBT.cpp` and `lmapf_lib/MAPFCompetition2023/tools/py_PLNS.cpp`. Backward Dijkstra heuristics are precomputed by the C++ wrapper defined in the file `lmapf_lib/MAPFCompetition2023/tools/py_compute_heuristics.cpp`. You can see how they are loaded in the environment class `LMAPFEnv`; a sketch of what the heuristic precomputation does is given after this list.
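For intuition, here is a minimal pure-Python sketch (not the repo's C++ implementation) of backward Dijkstra heuristic precomputation on a 4-connected grid: searching backward from a goal cell yields the shortest-path distance from every cell to that goal. The grid encoding (0 = free, 1 = obstacle) and the uniform edge costs are assumptions for illustration; with uniform costs this reduces to BFS, while a weighted map (cf. `map_weights_path`) would use per-edge weights.

```python
import heapq

def backward_dijkstra(grid, goal):
    """Shortest-path distance from every free cell to `goal`.

    grid: 2D list, 0 = free, 1 = obstacle (assumed encoding).
    goal: (row, col) of the goal cell.
    """
    h, w = len(grid), len(grid[0])
    dist = {goal: 0}
    heap = [(0, goal)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist.get((r, c), float("inf")):
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and grid[nr][nc] == 0:
                nd = d + 1  # uniform cost; a weighted map would use edge weights
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist
```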
See `eval.sh` for how to evaluate on the benchmark of this paper.
See `eval_ltf.sh` for how to evaluate on the benchmark of the learn-to-follow paper.
See `train.sh`. The training for imitation learning has started successfully if you see something similar to the following figure, which indicates that the algorithm has started collecting training data.
Please take a look at the lines with NOTE in the comments in the example configuration file `expr_configs/paper_exps_v3/small/bootstrap_from_pibt_iter1_sortation_small_a600_s500_none_annotated_1gpu.yaml`. These lines are the ones you will probably want to modify in your experiments.
I usually train models with 4 RTX 4090D GPUs (24GB memory each) and roughly 64 vCPUs. But since the approach is fundamentally imitation learning, fewer computational resources also work (you will need to modify the experiment configs). `expr_configs/paper_exps_v3/small/bootstrap_from_pibt_iter1_sortation_small_a600_s500_none_annotated_1gpu.yaml` gives an example for 1 RTX 4090D (24GB memory) and 16 vCPUs (80GB memory).
Depending on the numbers of CPUs and GPUs and their memory, you may need to adjust the following parameters in the configuration files:
- `framework`
  - `max_dagger_iterations`
  - `num_episodes_per_iter`
- `evaluation_manager`
  - `num_eval_rollouts`
- `rollout_manager`
  - `num_workers`: the number of workers used to collect data. It must be smaller than the total number of CPUs, because some CPUs are used for other processes according to the design of the distributed computation framework, Ray.
  - `batch_size`
  - `eval_batch_size`
- `training_manager`
  - `batch_size`: should be kept the same as `rollout_manager.batch_size`.
  - `num_trainers`: should be set to the number of GPUs to use.
For example, the default numbers in the other configurations are for 4 GPUs and 64 vCPUs, so if you want to use 1 GPU and 16 vCPUs, you can divide `rollout_manager.num_workers` and `training_manager.num_trainers` by 4. Since you probably don't want the training to take too long after the reduction in computational resources, you can also divide the other parameters mentioned above by 4. (CPU and GPU memory might also be a reason for reducing these parameters.) A sketch of this scaling is given below.
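As an illustration, here is a minimal sketch (not part of the repo) of scaling a 4-GPU configuration down by such a factor. The key nesting follows the list above, and the config path is a placeholder; your actual config layout may differ.

```python
import yaml

SCALE = 4  # e.g., 4 GPUs / 64 vCPUs -> 1 GPU / 16 vCPUs
CONFIG = "expr_configs/paper_exps_v3/some_4gpu_config.yaml"  # placeholder path

with open(CONFIG) as f:
    cfg = yaml.safe_load(f)

# Worker and trainer counts follow the hardware.
cfg["rollout_manager"]["num_workers"] //= SCALE
cfg["training_manager"]["num_trainers"] //= SCALE

# Optional: scale these too so each iteration does not take too long.
cfg["framework"]["num_episodes_per_iter"] //= SCALE
cfg["evaluation_manager"]["num_eval_rollouts"] //= SCALE

# Keep the two batch sizes identical, as noted above.
cfg["training_manager"]["batch_size"] = cfg["rollout_manager"]["batch_size"]

with open(CONFIG, "w") as f:
    yaml.safe_dump(cfg, f)
```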
You can take a look at the last few dozen lines of the file `light_malib/envs/LMAPF/map.py`. We use the same data format as the League of Robot Runners Competition 2023; a parsing sketch is given below.
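For reference, the competition's map files follow the standard MovingAI-style `.map` grid format: a short header (`type`, `height`, `width`, `map`) followed by rows of characters, where `.` is a free cell and `@`/`T` are obstacles. Here is a minimal parsing sketch (not the repo's reader; the file name is a placeholder):

```python
def read_map(path):
    # Parse a MovingAI-style .map file: 4 header lines, then the grid.
    with open(path) as f:
        lines = [line.rstrip("\n") for line in f]
    height = int(lines[1].split()[1])  # "height H"
    width = int(lines[2].split()[1])   # "width W"
    grid_lines = lines[4:4 + height]
    # 0 = free cell, 1 = obstacle
    return [[0 if ch == "." else 1 for ch in row[:width]] for row in grid_lines]

grid = read_map("warehouse.map")  # placeholder file name
```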
- recompile everything in an empty environment to check the dependencies.
- add more documentation.
- organize/rewrite the code.