This is a fork that reorganizes the official Jupyter notebook (ipynb) implementation of SayCan (Do As I Can, Not As I Say: Grounding Language in Robotic Affordances) to make further research easier.
This fork adds a CLI-like layer and also supports open-source LLMs such as Llama 3.1 via the transformers library.
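For reference, prompting an open-weight model through transformers typically looks like the sketch below. The model id and generation settings are assumptions for illustration; substitute whatever checkpoint this fork is actually configured with.

# Minimal sketch of prompting an open-source LLM via the transformers pipeline API.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",  # assumed model id; gated, needs Hugging Face access
    device_map="auto",  # requires the accelerate package
)
prompt = "Task: put the red block in the green bowl.\nPlan:"
print(generator(prompt, max_new_tokens=64)[0]["generated_text"])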
This repository has been tested on Ubuntu 22.04.2 with pip3==20.2.3 inside the conda environment described below.
Clone this repo, then create and activate a new conda environment with Python 3.9 as follows.
conda create -n saycan python=3.9.1
conda activate saycan
pip3 install pip==20.2.3
pip3 install -r requirements.txt
Ensure that you set up vLLM and register your Hugging Face token.
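A typical setup, assuming the standard pip package and the Hugging Face CLI, looks like this (pin versions as needed for compatibility with requirements.txt):

pip3 install vllm
huggingface-cli login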
The instructions below download from official sources. If there are any problems there, I also host the assets/ directory via this shared link; simply download it and unzip it in the project root directory (saycan.ROOT_DIR).
If you still have issues (e.g. broken links), contact me by finding my email on my personal webpage rushangkaria.github.io.
mkdir assets/
gdown -O assets/ 1Cc_fDSBL6QiDvNT4dpfAEbhbALSVoWcc
gdown -O assets/ 1yOMEm-Zp_DL3nItG9RozPeJAmeOldekX
gdown -O assets/ 1GsqNLhEl9dd4Mc3BM0dX3MibOI1FVWNM
unzip assets/ur5e.zip -d assets/
unzip assets/robotiq_2f_85.zip -d assets/
unzip assets/bowl.zip -d assets/
gsutil cp -r gs://cloud-tpu-checkpoints/detection/projects/vild/colab/image_path_v2 assets/
You can skip this step if you want to generate the data yourself with gen_data.py.
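If you do generate the data yourself, the invocation is presumably just the following; check the script's argument parser for any dataset-size or output-path options, since none are documented here.

python3 gen_data.py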
Download the pregenerated dataset by running:
gdown -O assets/ 1yCz6C-6eLWb4SFYKdkM-wz5tlMjbG2h8
gdown -O assets/ 1Nq0q1KbqHOA5O7aRSu4u7-u27EMMXqgP
Don't forget to add your OpenAI API key in llm.py.
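The exact variable name inside llm.py may differ; a common pattern, shown here as an assumption rather than this repo's actual code, is to read the key from the environment instead of hard-coding it:

import os
import openai

# OPENAI_API_KEY is the conventional environment variable name;
# adjust to match how llm.py actually stores the key.
openai.api_key = os.environ["OPENAI_API_KEY"]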
If you have downloaded the pretrained policy in step 2.4, you can now run demo.py to visualize the evaluation process.
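Assuming demo.py takes no required arguments (none are documented here):

python3 demo.py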
If you want to train a model from scratch, run train.py.
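Again assuming the default arguments:

python3 train.py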