- [2025-06] $I^2$-World is accepted to ICCV 2025.
Install PyTorch 1.13 + CUDA 11.6
conda create --name ii-world python=3.8
conda activate ii-world
pip install torch==1.13.0+cu116 torchvision==0.14.0+cu116 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu116
Install mmdet3d (v1.0.0rc4) related packages and build this project
pip install mmcv-full==1.7.0 -f https://download.openmmlab.com/mmcv/dist/cu117/torch1.13/index.html
pip install mmdet==2.28.2
pip install mmsegmentation==0.30.0
pip install mmengine
pip install -v -e .
Install other dependencies
pip install -r requirements.txt
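After installation, a quick sanity check can confirm that the pinned versions were actually picked up. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the pins are copied from the commands above with local build tags such as `+cu116` stripped before comparison.

```python
from importlib import metadata

# Pins copied from the install commands above (assumption: these are the
# only packages worth checking; requirements.txt is not inspected).
EXPECTED = {
    "torch": "1.13.0",
    "torchvision": "0.14.0",
    "torchaudio": "0.13.0",
    "mmcv-full": "1.7.0",
    "mmdet": "2.28.2",
    "mmsegmentation": "0.30.0",
}


def installed_version(package):
    """Return the installed version without its local tag, or None if missing."""
    try:
        return metadata.version(package).split("+")[0]
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) pair."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED)
    for package, (wanted, found) in problems.items():
        print(f"{package}: expected {wanted}, found {found or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

Run it inside the activated `ii-world` environment; any line it prints points to a package to reinstall.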
We train our model on 8 RTX 4090 GPUs.
Method | Dataset | Task | Rec.mIoU (%) | Rec.IoU (%) | Weights |
---|---|---|---|---|---|
II-Tokenizer | Occ3D-nus | Rec | 81.1 | 68.1 | Google-drive |
II-Tokenizer | STCOcc-Res | Rec | 24.8 | 32.2 | - |
II-Tokenizer | Occ3D-Waymo | Rec | 76.3 | 74.6 | - |
II-World | Occ3D-nus | Fore | 38.4 | 49.2 | Google-drive |
II-World | STCOcc-Res | Fore | 18.9 | 28.8 | - |
II-World | Occ3D-Waymo | Fore | 43.7 | 60.9 | - |
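For readers unfamiliar with the two metrics in the table, the sketch below shows one common way they are computed for occupancy grids: IoU measures geometry (occupied versus free voxels), while mIoU averages per-class IoU over the semantic classes. This is an illustration under assumptions, not the repository's evaluation code: the function name `occupancy_scores`, the flat-list input format, and the choice to average only over classes that appear are all hypothetical, and the free-space label must be supplied by the caller.

```python
def occupancy_scores(pred, target, num_classes, free_label):
    """Return (IoU, mIoU) in percent for two flat lists of voxel labels."""
    pairs = list(zip(pred, target))

    # Geometry: every non-free voxel counts as "occupied".
    inter = sum(p != free_label and t != free_label for p, t in pairs)
    union = sum(p != free_label or t != free_label for p, t in pairs)
    iou = 100.0 * inter / union if union else float("nan")

    # Semantics: per-class IoU, averaged over the non-free classes that
    # appear in the prediction or the target (an assumption of this sketch).
    per_class = []
    for c in range(num_classes):
        if c == free_label:
            continue
        c_inter = sum(p == c and t == c for p, t in pairs)
        c_union = sum(p == c or t == c for p, t in pairs)
        if c_union:
            per_class.append(c_inter / c_union)
    miou = 100.0 * sum(per_class) / len(per_class) if per_class else float("nan")
    return iou, miou


if __name__ == "__main__":
    # Toy example: 4 voxels, classes 0 and 1 are semantic, 2 is free space.
    print(occupancy_scores([0, 1, 2, 2], [0, 1, 1, 2], num_classes=3, free_label=2))
```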
- Download nuScenes from nuScenes.
- Download Occ3D-nus from Occ3D-nus.
- (Optional) Download Occ3D-Waymo from Occ3D-Waymo and unzip it to the data/waymo folder. We only use the validation split of Occ3D-Waymo in our experiments.
- (Optional) Download STCOcc-Res from STCOcc-Res and unzip it to the data/nuscenes folder.
- Download the generated info files from Google Drive and unzip them to the data/nuscenes folder. These *.pkl files can also be generated by running tools/create_data.py.
- (Optional) Download the visualization car model from Google Drive.
- Organize your folder structure as below:
project/
├── visualizer/
│   └── 3d_model.obj (optional)
├── ckpts/
│   ├── ii_scene_tokenizer_4f.pth
│   └── ii_generate_world.pth
└── data/
    ├── nuscenes/
    │   ├── samples/
    │   ├── v1.0-trainval/
    │   ├── gts/ (Occ3D-nus)
    │   ├── stc-results/ (prediction from STCOcc) (optional)
    │   ├── world-nuscenes_infos_train.pkl
    │   └── world-nuscenes_infos_val.pkl
    └── waymo/ (optional)
        ├── validation/
        ├── cam_infos_vali.pkl
        └── waymo_infos_val.pkl
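Before launching a job, it can help to confirm that the layout above is in place. The sketch below is a hypothetical helper (`missing_paths` is not part of the repository) that reports which entries of the tree are absent under a given project root. Entries not marked optional in the tree are treated as required, which is an assumption; for example, the checkpoints are only needed for evaluation and generation.

```python
from pathlib import Path

# Entries taken from the folder tree above.
REQUIRED = [
    "ckpts/ii_scene_tokenizer_4f.pth",
    "ckpts/ii_generate_world.pth",
    "data/nuscenes/samples",
    "data/nuscenes/v1.0-trainval",
    "data/nuscenes/gts",
    "data/nuscenes/world-nuscenes_infos_train.pkl",
    "data/nuscenes/world-nuscenes_infos_val.pkl",
]
OPTIONAL = [
    "visualizer/3d_model.obj",
    "data/nuscenes/stc-results",
    "data/waymo/validation",
    "data/waymo/cam_infos_vali.pkl",
    "data/waymo/waymo_infos_val.pkl",
]


def missing_paths(root, relative_paths):
    """Return the relative paths that do not exist under root."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).exists()]


if __name__ == "__main__":
    for path in missing_paths(".", REQUIRED):
        print(f"missing (required): {path}")
    for path in missing_paths(".", OPTIONAL):
        print(f"missing (optional): {path}")
```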
Train II-Tokenizer with 8 GPUs:
bash tools/dist_train.sh configs/scene_tokenizer/ii_scene_tokenizer_4f.py 8
Evaluate II-Tokenizer with 6 GPUs:
bash tools/dist_test.sh configs/scene_tokenizer/ii_scene_tokenizer_4f.py TO/CKPTS
Important
Before training or evaluating II-World, first evaluate the II-Tokenizer to generate the prediction tokens. By default, the II-Tokenizer saves the prediction tokens to the data/nuscenes/save_dir/token_4f folder.
You can change test_data_config in the tokenizer config to use a different dataset.
To generate prediction tokens for the training set, set ann_file in test_data_config to world-nuscenes_infos_train.pkl.
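Because II-World depends on the tokens written by the tokenizer evaluation, a small pre-flight check can catch a missing or empty token folder early. This is a minimal sketch: `count_token_files` is a hypothetical helper, and since the token file format is not documented here, it simply counts every regular file under the default save directory.

```python
from pathlib import Path

# Default save location stated in the note above.
DEFAULT_TOKEN_DIR = Path("data/nuscenes/save_dir/token_4f")


def count_token_files(token_dir=DEFAULT_TOKEN_DIR):
    """Count regular files under token_dir; return 0 if the folder is missing."""
    token_dir = Path(token_dir)
    if not token_dir.is_dir():
        return 0
    return sum(1 for p in token_dir.rglob("*") if p.is_file())


if __name__ == "__main__":
    n = count_token_files()
    if n == 0:
        print(f"No tokens found in {DEFAULT_TOKEN_DIR}; evaluate the II-Tokenizer first.")
    else:
        print(f"Found {n} file(s) in {DEFAULT_TOKEN_DIR}.")
```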
Train II-World with 8 GPUs:
bash tools/dist_train.sh configs/world_model/ii_generate_world.py 8
Evaluate II-World with 6 GPUs:
bash tools/dist_test.sh configs/world_model/ii_generate_world.py TO/CKPTS
We provide a simple script to visualize high-level control of the world generation (using different cmd values):
python tools/generate.py configs/world_model/ii_generate_world.py ckpts/ii_generate_world.pth \
--generate_path generate_output --generate_scene_name scene-0564 --generate_frame 12 --task_mode high-level-control
You can also visualize the generated world under fine-grained control (using different transformation matrices):
python tools/generate.py configs/world_model/ii_generate_world.py ckpts/ii_generate_world.pth \
--generate_path generate_output --generate_scene_name scene-0270 --generate_frame 12 --task_mode generate
To visualize the 3D occupancy map, set --save_npz in the command above; the generated 3D occupancy npz files will be saved in the generate_output folder.
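When generating several scenes or modes, assembling the command programmatically avoids copy-paste mistakes. The sketch below only builds and prints command lines from the flags shown above; `build_generate_command` is a hypothetical helper, and passing `--save_npz` as a bare switch with no value is an assumption about how `tools/generate.py` parses it.

```python
import shlex

CONFIG = "configs/world_model/ii_generate_world.py"
CHECKPOINT = "ckpts/ii_generate_world.pth"


def build_generate_command(scene_name, task_mode, frame=12,
                           output_dir="generate_output", save_npz=False):
    """Return the argv list for one tools/generate.py run."""
    argv = [
        "python", "tools/generate.py", CONFIG, CHECKPOINT,
        "--generate_path", output_dir,
        "--generate_scene_name", scene_name,
        "--generate_frame", str(frame),
        "--task_mode", task_mode,
    ]
    if save_npz:
        # Assumption: --save_npz is a boolean switch.
        argv.append("--save_npz")
    return argv


if __name__ == "__main__":
    runs = [("scene-0564", "high-level-control"), ("scene-0270", "generate")]
    for scene, mode in runs:
        # Print only; pass the list to subprocess.run to actually execute it.
        print(shlex.join(build_generate_command(scene, mode, save_npz=True)))
```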
Use the following command to visualize the generated 3D occupancy map:
python tools/vis_occ_3d.py --vis-single-data PATH/TO/GENERATED/3D_OCCUPANCY.npz --vis-path demo_output
More visualization options can be found in the tools/vis_occ_3d.py file.
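If you want to check what a generated npz file contains before visualizing it, note that an `.npz` file is a zip archive with one `.npy` member per array. The sketch below lists the array names using only the standard library; `list_npz_arrays` is a hypothetical helper, and the names stored by `tools/generate.py` are not documented here, so none are assumed.

```python
import sys
import zipfile


def list_npz_arrays(npz_path):
    """Return the array names stored in an .npz archive without loading them."""
    with zipfile.ZipFile(npz_path) as archive:
        return [
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python list_npz.py PATH/TO/GENERATED/3D_OCCUPANCY.npz
    for array_name in list_npz_arrays(sys.argv[1]):
        print(array_name)
```

With NumPy installed, `numpy.load(path)` on the same file exposes these names through its `.files` attribute.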
Thanks to the following excellent projects: