pzhren/Surfer

Surfer: A World Model-Based Framework for Vision-Language Robot Manipulation

[arXiv] [Website].

SeaWave

SeaWave is a benchmark for robotic manipulation with progressive reasoning tasks, built on a realistic robotic manipulation simulator. It introduces a new high-fidelity digital twin scene built in Unreal Engine 5 and includes 40K natural language instructions generated by ChatGPT for detailed evaluation of robot manipulation.

Simulator

Pipeline

[Figure: pipeline]

Environment

Resource Consumption

Our experiments used a single NVIDIA GeForce RTX 3090 GPU. The simulator occupies approximately 2 to 3 GB of GPU memory.
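Before launching the simulator, it may help to confirm that a GPU has enough free memory. The following is a minimal sketch and not part of this repository: the helper names (`parse_free_mib`, `gpus_with_headroom`, `query_free_mib`) and the 3 GiB threshold are assumptions, the threshold being the upper end of the figure quoted above. It relies on the standard `nvidia-smi --query-gpu=memory.free` query.

```python
import shutil
import subprocess

# Assumption: 3 GiB, the upper end of the 2 to 3 GB simulator footprint quoted above.
REQUIRED_MIB = 3 * 1024


def parse_free_mib(csv_text: str) -> list[int]:
    """Parse one free-memory value (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def gpus_with_headroom(free_mib: list[int], required_mib: int = REQUIRED_MIB) -> list[int]:
    """Return the indices of GPUs whose free memory meets the requirement."""
    return [i for i, free in enumerate(free_mib) if free >= required_mib]


def query_free_mib() -> list[int]:
    """Query free memory per GPU; returns [] if nvidia-smi is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.free", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_free_mib(result.stdout)


if __name__ == "__main__":
    free = query_free_mib()
    print("GPUs with enough free memory for the simulator:", gpus_with_headroom(free))
```

The parsing and filtering steps are kept separate from the `nvidia-smi` call so they can be checked on a machine without a GPU. Training itself will need memory beyond the simulator's share, so the threshold here is only a lower bound.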

Simulator

See simulator details.

Training

python src/main.py

Test

python src/eval.py

Citation

@misc{ren2024surferprogressivereasoningworld,
      title={Surfer: Progressive Reasoning with World Models for Robotic Manipulation}, 
      author={Pengzhen Ren and Kaidong Zhang and Hetao Zheng and Zixuan Li and Yuhang Wen and Fengda Zhu and Mas Ma and Xiaodan Liang},
      year={2024},
      eprint={2306.11335},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2306.11335}, 
}

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.
