ReinFlow

ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning

📢 News

[2025/06/14] Updated the webpage with a detailed explanation of the algorithm design.

🚀 Installation

Please follow the steps in installation/reinflow-setup.md.

🚀 Quick Start: Reproduce Our Results

To fully reproduce our experiments, please refer to ReproduceExps.md. To download our training data and reproduce the plots in the paper, please refer to ReproduceFigs.md.

🚀 Implementation Details

Please refer to Implement.md for descriptions of key hyperparameters of FQL, DPPO, and ReinFlow.

🚀 Adding Your Own Dataset or Environment

Please refer to Custom.md.

🚀 Debug Aid and Known Issues

Please refer to KnownIssues.md to see how to resolve errors you encounter.

⭐ Coming Soon

Support fine-tuning Mean Flow with online RL.
Possibly open-source the WandB projects via a corporate account (currently available in .csv format).
Replace figures with videos in the drop-down menus for specific tasks on the webpage.

License

This repository is released under the MIT license. See LICENSE. If you use our code, we would appreciate it if you include the license at the beginning of your scripts.

Acknowledgement

This repository was developed from multiple open-source projects. Major references include:

TorchCFM, Tong et al.: Conditional flow-matching repository.
Shortcut Models, Frans et al.: One-step Diffusion via Shortcut Models.
DPPO, Ren et al.: DPPO official implementation.

For more references, please refer to Acknowledgement.md.
