🌍 $I^2$-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting

arXiv

demo.mp4

🚀 News

[2025-06] $I^2$-World is accepted to ICCV 2025.

🛠️ Environment

Install PyTorch 1.13 with CUDA 11.6

conda create --name ii-world python=3.8
conda activate ii-world
pip install torch==1.13.0+cu116 torchvision==0.14.0+cu116 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu116

Install the packages required by mmdet3d (v1.0.0rc4) and build this project

pip install mmcv-full==1.7.0 -f https://download.openmmlab.com/mmcv/dist/cu116/torch1.13/index.html
pip install mmdet==2.28.2
pip install mmsegmentation==0.30.0
pip install mmengine
pip install -v -e .

Install other dependencies

pip install -r requirements.txt
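
After installation, it can help to confirm that the pinned versions were actually picked up. The following is a minimal sketch, not part of this repository: the helper name `find_mismatches` is hypothetical, and the expected versions are copied from the install commands above. It uses only the standard library, so it does not import torch or mmcv itself.

```python
from importlib import metadata

# Versions pinned by the install commands in this section.
EXPECTED = {
    "torch": "1.13.0",
    "torchvision": "0.14.0",
    "torchaudio": "0.13.0",
    "mmcv-full": "1.7.0",
    "mmdet": "2.28.2",
    "mmsegmentation": "0.30.0",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every pin that is not met."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the active environment.
            mismatches[name] = None
            continue
        # Ignore local build tags such as "+cu116" when comparing.
        if installed.split("+")[0] != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches(EXPECTED).items():
        found = installed or "not installed"
        print(f"{name}: expected {EXPECTED[name]}, found {found}")
```

Run it inside the activated ii-world environment; no output means every pinned package matches.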

🤗 Model Zoo

We train our models on 8 RTX 4090 GPUs.

Method	Dataset	Task	Rec.mIoU (%)	Rec.IoU (%)	Weights
II-Tokenizer	Occ3D-nus	Rec	81.1	68.1	Google-drive
II-Tokenizer	STCOcc-Res	Rec	24.8	32.2	-
II-Tokenizer	Occ3D-Waymo	Rec	76.3	74.6	-
II-World	Occ3D-nus	Fore	38.4	49.2	Google-drive
II-World	STCOcc-Res	Fore	18.9	28.8	-
II-World	Occ3D-Waymo	Fore	43.7	60.9	-

📦 Prepare Dataset

Download nuScenes from nuScenes
Download Occ3D-nus from Occ3D-nus
(Optional) Download Occ3D-Waymo from Occ3D-Waymo and unzip it to the data/waymo folder. We only use the validation split of Occ3D-Waymo in our experiments.
(Optional) Download STCOcc-Res from STCOcc-Res and unzip it to the data/nuscenes folder.
Download the generated info files from Google Drive and unzip them to the data/nuscenes folder. These *.pkl files can also be generated by running tools/create_data.py.
(Optional) Download the visualization car model from Google Drive.
Organize your folder structure as below:

├── project
├── visualizer/
│   ├── 3d_model.obj (optional)
├── ckpts/
│   ├── ii_scene_tokenizer_4f.pth
│   ├── ii_generate_world.pth
├── data/
│   ├── nuscenes/
│   │   ├── samples/ 
│   │   ├── v1.0-trainval/
│   │   ├── gts/ (Occ3D-nus)
│   │   ├── stc-results/ (prediction from STCOcc) (optional)
│   │   ├── world-nuscenes_infos_train.pkl
│   │   ├── world-nuscenes_infos_val.pkl
│   ├── waymo/ (optional)
│   │   ├── validation/
│   │   ├── cam_infos_vali.pkl
│   │   ├── waymo_infos_val.pkl
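
To catch a misplaced download before launching a job, the layout above can be checked with a few lines of Python. This is a minimal sketch rather than a tool shipped with the repository: the helper name `missing_entries` is hypothetical, and the list covers only the non-optional nuScenes entries from the tree.

```python
from pathlib import Path

# Non-optional dataset entries from the folder structure above,
# given relative to the project root.
REQUIRED = [
    "data/nuscenes/samples",
    "data/nuscenes/v1.0-trainval",
    "data/nuscenes/gts",
    "data/nuscenes/world-nuscenes_infos_train.pkl",
    "data/nuscenes/world-nuscenes_infos_val.pkl",
]


def missing_entries(root, required=REQUIRED):
    """Return the required paths that do not exist under root, in listed order."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_entries("."):
        print(f"missing: {rel}")
```

Run it from the project root; each printed line names an entry that still needs to be downloaded or moved.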

🎇 Training and Evaluation

Train II-Tokenizer with 8 GPUs:

bash tools/dist_train.sh configs/scene_tokenizer/ii_scene_tokenizer_4f.py 8

Evaluate II-Tokenizer with 6 GPUs:

bash tools/dist_test.sh configs/scene_tokenizer/ii_scene_tokenizer_4f.py TO/CKPTS 6

Important

Before training or evaluating II-World, first evaluate the II-Tokenizer to generate the prediction tokens. By default, the II-Tokenizer saves them to the data/nuscenes/save_dir/token_4f folder.

You can change the test_data_config in the tokenizer config for different datasets.

To generate prediction tokens for the training set, set ann_file in test_data_config to world-nuscenes_infos_train.pkl.
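
Since II-World depends on the tokens written by the tokenizer evaluation, a quick pre-flight check can save a failed launch. The sketch below is an assumption-laden illustration, not repository code: the helper name `count_token_files` is hypothetical, and because the file format and sub-folder layout of the saved tokens are not described here, it simply counts every regular file under the default output folder.

```python
from pathlib import Path

# Default output folder of the II-Tokenizer evaluation, as stated above.
DEFAULT_TOKEN_DIR = Path("data/nuscenes/save_dir/token_4f")


def count_token_files(token_dir=DEFAULT_TOKEN_DIR):
    """Count regular files under token_dir; return 0 if the folder is absent."""
    token_dir = Path(token_dir)
    if not token_dir.is_dir():
        return 0
    # Recurse, because the sub-folder layout of the tokens is not assumed.
    return sum(1 for p in token_dir.rglob("*") if p.is_file())


if __name__ == "__main__":
    n = count_token_files()
    if n == 0:
        print("No prediction tokens found; evaluate the II-Tokenizer first.")
    else:
        print(f"Found {n} token files.")
```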

Train II-World with 8 GPUs:

bash tools/dist_train.sh configs/world_model/ii_generate_world.py 8

Evaluate II-World with 6 GPUs:

bash tools/dist_test.sh configs/world_model/ii_generate_world.py TO/CKPTS 6

🎥 Visualization

We provide a simple script for visualizing high-level control of the world generation (using different cmd):

python tools/generate.py configs/world_model/ii_generate_world.py ckpts/ii_generate_world.pth \
--generate_path generate_output --generate_scene_name scene-0564 --generate_frame 12 --task_mode high-level-control

You can also visualize the generated world under fine-grained control (using different transformation matrices):

python tools/generate.py configs/world_model/ii_generate_world.py ckpts/ii_generate_world.pth \
--generate_path generate_output --generate_scene_name scene-0270 --generate_frame 12 --task_mode generate

To visualize the 3D occupancy map, add --save_npz to the commands above; the generated 3D occupancy .npz files will be saved in the generate_output folder.

Use the following command to visualize the generated 3D occupancy map:

python tools/vis_occ_3d.py --vis-single-data PATH/TO/GENERATED/3D_OCCUPANCY.npz --vis-path demo_output

More visualization options can be found in the tools/vis_occ_3d.py file.
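
Before rendering, it may be useful to look at what a saved .npz file contains. The array names and shapes written by --save_npz are not documented here, so this minimal sketch makes no assumption about them and only lists whatever arrays are present. The helper name `summarize_npz` is hypothetical; `numpy.load` and the `files` attribute of the returned archive are standard NumPy.

```python
import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype_name)} for every array in an .npz file."""
    summary = {}
    # np.load on an .npz file returns a lazy archive; close it when done.
    with np.load(path) as archive:
        for name in archive.files:
            array = archive[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python inspect_npz.py PATH/TO/GENERATED/3D_OCCUPANCY.npz
    for name, (shape, dtype) in summarize_npz(sys.argv[1]).items():
        print(f"{name}: shape={shape}, dtype={dtype}")
```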

Acknowledgement

Thanks to the following excellent projects:
