🎬 [Project Page], 📜 [Technical Report], 🤗 [Model Weights]
Shenyuan Gao, Siyuan Zhou, Yilun Du, Jun Zhang, Chuang Gan
TL;DR: AdaWorld is a highly adaptable world model pretrained with continuous latent actions from thousands of environments, enabling efficient action transfer, adaptation, and planning with minimal finetuning.
- Action transfer (source video → target scene)
- Visual planning (action-agnostic vs. AdaWorld)
We introduce latent actions as a unified condition for action-aware pretraining from videos. AdaWorld can readily transfer actions across contexts without any further training. By initializing the control interface with the corresponding latent actions, AdaWorld can also be adapted into specialized world models efficiently, achieving significantly better planning results.
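To make the action transfer idea concrete, below is a minimal PyTorch sketch. The module names (`latent_action_encoder`, `world_model`) and their call signatures are placeholders for illustration only and do not reflect the actual interfaces in this repository: a latent action is inferred from each pair of consecutive source frames, then used to condition the world model as it rolls out the target scene.

```python
import torch

def transfer_actions(latent_action_encoder, world_model, source_video, target_frame):
    """Replay the actions observed in a source video inside a new target scene.

    Args:
        latent_action_encoder: module mapping two consecutive frames to a
            continuous latent action (hypothetical interface).
        world_model: module predicting the next frame from the current frame
            and a latent action (hypothetical interface).
        source_video: (T, C, H, W) tensor of source frames.
        target_frame: (C, H, W) tensor, the first frame of the target scene.

    Returns:
        (T, C, H, W) tensor: rollout of the target scene under the transferred actions.
    """
    frames = [target_frame]
    with torch.no_grad():
        for t in range(source_video.shape[0] - 1):
            # Infer the latent action that explains the transition between two source frames.
            latent_action = latent_action_encoder(
                source_video[t].unsqueeze(0), source_video[t + 1].unsqueeze(0)
            )
            # Condition the world model on that latent action to advance the target scene.
            next_frame = world_model(frames[-1].unsqueeze(0), latent_action).squeeze(0)
            frames.append(next_frame)
    return torch.stack(frames)
```

In the same spirit, adapting AdaWorld to a specific environment initializes the control interface from the latent actions corresponding to that environment's raw actions, so only minimal finetuning is required.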
Our implementation is built on Vista and Jafar. Thanks for their great open-source work!
If any part of our paper or code helps your research, please consider citing our work and giving a star to this repository.
@article{gao2025adaworld,
  title={AdaWorld: Learning Adaptable World Models with Latent Actions},
  author={Gao, Shenyuan and Zhou, Siyuan and Du, Yilun and Zhang, Jun and Gan, Chuang},
  journal={arXiv preprint arXiv:2503.18938},
  year={2025}
}
If you have any questions or comments, feel free to contact me via email (sygao@connect.ust.hk). Suggestions and collaborations are also highly welcome!