Coming soon.
VITA is a noise-free and conditioning-free policy learning framework that learns visuomotor policies by directly flowing from latent images to latent actions.
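As a rough illustration of what "flowing from latent images to latent actions" could look like, the sketch below uses a generic straight-line flow matching objective in NumPy. It is not the authors' implementation: the function names (`interpolate`, `flow_matching_loss`, `euler_flow`), the linear interpolation path, the Euler integrator, and the assumption that both latents share the same shape are all illustrative choices. The velocity model is passed in as a plain callable, so no noise sample and no separate conditioning input appear anywhere in the sketch.

```python
import numpy as np

def interpolate(z_img, z_act, t):
    """Point on the straight path from the latent image (t=0) to the latent action (t=1)."""
    return (1.0 - t) * z_img + t * z_act

def flow_matching_loss(velocity_fn, z_img, z_act, t):
    """Mean squared error between the predicted velocity and the straight-line velocity."""
    x_t = interpolate(z_img, z_act, t)
    target_velocity = z_act - z_img          # constant along a straight path
    predicted_velocity = velocity_fn(x_t, t)  # hypothetical model: (x_t, t) -> velocity
    return float(np.mean((predicted_velocity - target_velocity) ** 2))

def euler_flow(velocity_fn, z_img, num_steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1, starting at the latent image."""
    x = np.array(z_img, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = np.full((x.shape[0], 1), k * dt)  # one time value per batch row
        x = x + dt * velocity_fn(x, t)
    return x
```

In this sketch, training would minimize `flow_matching_loss` over sampled times `t` in [0, 1], and inference would call `euler_flow` on the latent image to obtain a latent action, which a separate decoder (not shown) would turn into motor commands.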
- Project Page
- arXiv Paper
- PDF
@article{gao2025vita,
title={VITA: Vision-to-Action Flow Matching Policy},
author={Gao, Dechen and Zhao, Boqi and Lee, Andrew and Chuang, Ian and Zhou, Hanchu and Wang, Hang and Zhao, Zhe and Zhang, Junshan and Soltani, Iman},
journal={arXiv preprint arXiv:2507.13231},
year={2025}
}