This repository contains the reference implementation for the paper "Mean-Field RL for Large-Scale Unit-Capacity Pickup-and-Delivery Problems".
- Create a fresh Python environment (3.9 recommended).
- Install exact dependencies:
pip install -r requirements.txt
- Install JAX with CUDA (adjust CUDA version as needed):
pip install -U "jax[cuda12]"
- Generate datasets (uniform, clustered, cities) as needed using scripts under datasets/.
- Run k-means preprocessing for clustering datasets:
python preprocessing.py
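Before launching a long training run, it can help to confirm that the JAX install from the setup steps actually sees a GPU, since JAX typically falls back to CPU when the CUDA build cannot be initialized. The snippet below is a minimal sketch and is not part of this repository; the helper name `jax_backend_summary` is hypothetical.

```python
import importlib.util


def jax_backend_summary():
    """Return (backend, device descriptions), or None if JAX is not installed.

    Hypothetical helper for illustration; not part of this repository.
    """
    if importlib.util.find_spec("jax") is None:
        return None
    import jax

    # default_backend() reports the platform in use, e.g. "cpu" or "gpu".
    return jax.default_backend(), [str(d) for d in jax.devices()]


summary = jax_backend_summary()
if summary is None:
    print("JAX is not installed in this environment.")
else:
    backend, devices = summary
    print(f"backend={backend}, devices={devices}")
```

If the reported backend is `cpu` on a GPU machine, the CUDA variant chosen in the install step probably does not match the local driver or toolkit.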
The main entry point is exp_run.py, which wraps PPO training for the mean-field VRP environment. Example usage:
python exp_run.py \
--seed=0 \
--config=0 \
--load_datasets=0 \
--dataset=clustered \
--k=5 \
--N=500 \
--timesteps=300000000 \
--save_full=1
Saved artifacts are written under results/flax_ckpt/<exp_dir>/, including learned parameters and brief metrics.
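To repeat the example run over several seeds, the command line can be assembled programmatically. This is a rough sketch that uses only the flags shown in the example above; the seed range 0 to 2 and the helper name `build_command` are assumptions made for illustration, and which other flag values exp_run.py accepts is not checked here.

```python
import shlex


def build_command(seed, dataset="clustered", k=5, n=500, timesteps=300_000_000):
    """Build the argv list for one exp_run.py run (hypothetical helper)."""
    return [
        "python", "exp_run.py",
        f"--seed={seed}",
        "--config=0",
        "--load_datasets=0",
        f"--dataset={dataset}",
        f"--k={k}",
        f"--N={n}",
        f"--timesteps={timesteps}",
        "--save_full=1",
    ]


# Assumed seed range; adjust to the number of repetitions you need.
commands = [build_command(seed) for seed in range(3)]
for cmd in commands:
    # Print the command only. To launch it, pass cmd to
    # subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to inspect them, or to paste them into a job scheduler, before committing to a run of 300 million timesteps each.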
- train_mfvrp_g.py: Training for the limiting MFVRP problem with staged step sizes. (MFVRP-G)
- train_mfvrp_g_extra.py: Same as above with --extra_run=1; used for a rank-1 parametrization of actions.
- eval_finetune_sweeps.py: Finetuning and cross-dataset evaluation sweeps (uniform/cities) using a pretrained checkpoint. (MFVRP-F)
After experiments, generate figures and tables via the plotting and eval scripts.
Ensure logs (exp0.log, etc.) and checkpoints exist as expected before plotting.
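A small pre-flight check along these lines can catch missing inputs before the plotting scripts fail. It is a sketch under stated assumptions: the checkpoint root results/flax_ckpt/ comes from the text above, but the location of the log files (here the current directory), the experiment name `my_experiment`, and the helper `missing_artifacts` are all hypothetical.

```python
from pathlib import Path


def missing_artifacts(exp_dir, log_names=("exp0.log",),
                      ckpt_root="results/flax_ckpt", log_root="."):
    """Return the expected paths that are absent (hypothetical helper)."""
    missing = []
    ckpt_dir = Path(ckpt_root) / exp_dir
    # Treat a missing or empty checkpoint directory as missing.
    if not ckpt_dir.is_dir() or not any(ckpt_dir.iterdir()):
        missing.append(str(ckpt_dir))
    for name in log_names:
        log_path = Path(log_root) / name
        if not log_path.is_file():
            missing.append(str(log_path))
    return missing


# "my_experiment" is a placeholder for an actual <exp_dir> name.
problems = missing_artifacts("my_experiment")
print("all inputs present" if not problems else f"missing: {problems}")
```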
We include simple baselines for comparison:
- greedy_approach/greedy_approach.py for the greedy baseline
- pyvrp_approach/run_pyvrp.py for the PyVRP baseline
- ortools/vrp.py for the OR-Tools baseline