This repository contains scripts for evaluating DROID policies in a simple Isaac Sim environment.
Here is an example rollout of a pi0-FAST-DROID policy:
[Example rollouts: Scene 1, Scene 2, Scene 3]
The simulation is tuned to work zero-shot with DROID policies trained on the real-world DROID dataset, so no separate simulation data is required.
Note: The current simulator works best for policies trained with a joint-position action space (not joint-velocity control). We provide examples for evaluating pi0-FAST-DROID policies trained with joint-position control below.
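For reference, here is a minimal sketch of the action format such a policy produces, assuming the standard DROID setup of a 7-DoF Franka arm plus a gripper (the exact layout and gripper convention are assumptions, not part of this repo):

```python
import numpy as np

# Hypothetical single-step action from a joint-position policy:
# 7 absolute joint-position targets (rad) followed by 1 gripper command.
action = np.array(
    [0.0, -0.57, 0.0, -2.1, 0.0, 1.6, 0.0,  # joint-position targets (rad)
     0.0],                                   # gripper (0 = open, 1 = closed; assumed convention)
    dtype=np.float32,
)
assert action.shape == (8,)
```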
Clone the repo:

```bash
git clone --recurse-submodules git@github.com:arhanjain/sim-evals.git
cd sim-evals
```
Install uv (see: https://github.com/astral-sh/uv#installation). For example, on Linux/macOS:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
Create and activate the virtual environment:

```bash
uv sync
source .venv/bin/activate
```
Install extra dependencies:

```bash
./submodules/IsaacLab/isaaclab.sh -i
```
First, make sure you download the simulation assets and unpack them into the root directory of this package. Using the AWS CLI, this can be done with:

```bash
aws s3 cp s3://openpi-assets-simeval/env_assets/simple_example/assets.zip ./ --no-sign-request
unzip assets.zip
```
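If you prefer not to install the AWS CLI, the same download can be done from Python with boto3 (a sketch, assuming `pip install boto3`; the bucket is public, hence the unsigned config):

```python
import zipfile
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Anonymous access to the public openpi-assets-simeval bucket.
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
s3.download_file("openpi-assets-simeval", "env_assets/simple_example/assets.zip", "assets.zip")

# Unpack into the repo root, matching the `unzip assets.zip` step above.
with zipfile.ZipFile("assets.zip") as zf:
    zf.extractall(".")
```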
Then, in a separate terminal, launch the policy server on localhost:8000.
For example, to launch a pi0-FAST-DROID policy (with joint-position control), check out the karl/droid_policies branch of openpi and run the command below in a separate terminal:

```bash
XLA_PYTHON_CLIENT_MEM_FRACTION=0.5 uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi0_fast_droid_jointpos --policy.dir=s3://openpi-assets-simeval/pi0_fast_droid_jointpos
```
Note: We set XLA_PYTHON_CLIENT_MEM_FRACTION=0.5 to avoid JAX hogging all the GPU memory (since Isaac Sim needs to use the same GPU).
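To sanity-check the server from Python, you can query it directly. This sketch assumes the openpi_client package that ships with openpi and its WebsocketClientPolicy API:

```python
from openpi_client import websocket_client_policy

# Connect to the policy server started above (localhost:8000).
client = websocket_client_policy.WebsocketClientPolicy(host="localhost", port=8000)

# `obs` must match the DROID observation format the policy expects
# (camera images plus proprioception); see run_eval.py for how it is built.
# result = client.infer(obs)  # returns a dict containing the predicted action(s)
```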
Finally, run the evaluation script:

```bash
python run_eval.py --episodes [INT] --scene [INT] --headless
```

Here `--episodes` sets the number of rollouts, `--scene` selects which example scene to load, and `--headless` runs Isaac Sim without the GUI.
At its core, the evaluation loop looks like this:

```python
import gymnasium as gym
import torch
from tqdm import tqdm

env_cfg.set_scene(scene)  # env_cfg comes from this package; pass the scene integer
env = gym.make("DROID", cfg=env_cfg)
obs, _ = env.reset()
obs, _ = env.reset()  # need a second render cycle to get correctly loaded materials

client = ...  # your policy client of choice

max_steps = env.env.max_episode_length
for _ in tqdm(range(max_steps), desc="Episode"):
    ret = client.infer(obs, INSTRUCTION)  # run inference on your policy
    action = torch.tensor(ret["action"])[None]  # add a batch dimension
    obs, _, term, trunc, _ = env.step(action)
    if term or trunc:  # episode terminated or hit the time limit (Gymnasium API)
        break
env.close()
```