This repository is the official implementation of our ICCV 2025 paper:
"Diffusion-Based Extreme High-speed Scenes Reconstruction with the Complementary Vision Sensor".
We leverage a novel complementary vision sensor, Tianmouc, which outputs high-speed, multi-bit and sparse spatio-temporal difference data together with RGB frames to record extreme high-speed scenes.
Furthermore, we propose a Cascaded Bi-directional Recurrent Diffusion Model (CBRDM) that achieves accurate, sharp, and color-rich video frame reconstruction.
📩 For any questions, please contact Yapeng Meng (myp23@mails.tsinghua.edu.cn).
- ✅ Released model weights, inference scripts, and demo data
- ✅ Released training scripts and raw dataset
- 🔜 Coming soon: compressed multi-exposure-time dataset
See the open issues for planned updates.
The following setup was tested on:
Ubuntu 22.04 LTS • Python 3.10.15 • CUDA 11.8 • RTX 4090 • Conda
```bash
conda create -y -n CBRDM python=3.10
conda activate CBRDM
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
```
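After installing the requirements, a quick sanity check can confirm the core dependencies are importable. This is a minimal helper we introduce here for convenience, not part of the repo:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    # Core dependencies used by this repo (tianmoucv is installed in the next step).
    missing = missing_packages(["torch", "torchvision", "tianmoucv"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("Environment looks OK.")
```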
```bash
pip install tianmoucv
```

Download the pretrained models from Google Drive or BaiduYun (code: f3ex).
Unzip and place them at:
- `./checkpoints/TianmoucRec_CBRDM/`
- `./demo_data/`
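To verify the unzipped folders ended up where the demo expects them, a small layout check can be run from the repo root. The `verify_layout` helper is hypothetical, introduced here only for illustration; the two paths it checks are the ones listed above:

```python
from pathlib import Path

def verify_layout(root=".", required=("checkpoints/TianmoucRec_CBRDM", "demo_data")):
    """Return the required sub-paths (relative to `root`) that are absent."""
    root = Path(root)
    return [p for p in required if not (root / p).is_dir()]

if __name__ == "__main__":
    absent = verify_layout()
    if absent:
        print("Missing directories:", absent)
    else:
        print("Checkpoints and demo data found.")
```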
The demo code needs to decode raw TianmoucV1 data from `.tmdat` files, so install `tianmoucv` first:

```bash
pip install tianmoucv
```
Then run the demo with:
```bash
python demo.py --sample_name VanGogh --device cuda:0
python demo.py --sample_name qrcode_rotate --device cuda:0
python demo.py --sample_name dog_rotate --device cuda:0
python demo.py --sample_name qrcode_shaking --device cuda:0
```

Reconstructed videos will be saved in `./demo_output`.
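To run all four demo samples in one go, a short batch driver can wrap the commands above. The `demo_command` helper is our own sketch, not part of the repo's API:

```python
import subprocess
import sys

# Demo sample names from the commands above.
SAMPLES = ["VanGogh", "qrcode_rotate", "dog_rotate", "qrcode_shaking"]

def demo_command(sample, device="cuda:0"):
    """Build the same CLI invocation shown above for one sample."""
    return [sys.executable, "demo.py", "--sample_name", sample, "--device", device]

if __name__ == "__main__":
    for sample in SAMPLES:
        # check=True stops the loop if any reconstruction fails.
        subprocess.run(demo_command(sample), check=True)
```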
Our original training dataset (without TD/SD compression) exceeds 2 TB.
It can be downloaded from BaiduYun (code: t6jw).
A compressed dataset will be released soon.
Edit your dataset root in the YAML config file under `./config/`:

```yaml
dir: /your_dataset_root/Tianmouc_dataset_X4K1000_new1_hdf5/train
dir: /your_dataset_root/Tianmouc_dataset_SportsSloMo1_denoise_h5py/train
dir: /your_dataset_root/Tianmouc_dataset_GoPro1_denoise_h5py/train
dir: /your_dataset_root/Tianmouc_dataset_SportsSloMo1_denoise_h5py/valid
```

(a) Non-Recurrent Base Model
A lightweight version without bi-directional recurrent blocks for faster training and lower memory:
```bash
BASE_CKPT_DIR=./checkpoints \
CUDA_VISIBLE_DEVICES=7 \
python train_TMRec_multiGPU.py \
--config config/first_stage_no_recurrent.yaml \
--no_wandb
```

(b) Bi-Directional Recurrent Version (BRDM)
Includes the recurrent block proposed in our paper.
If GPU memory is limited, decrease `select_divs` in the config file:

```bash
BASE_CKPT_DIR=./checkpoints \
MASTER_ADDR="localhost" \
MASTER_PORT="12356" \
WORLD_SIZE=4 \
CUDA_VISIBLE_DEVICES=4,5,6,7 \
python train_TMRec_multiGPU.py \
--config config/first_stage_BRDM.yaml \
--no_wandb
```

We recommend initializing from the first-stage non-recurrent checkpoint to accelerate convergence.
Set in `sr_stage_from_base.yaml`:

```yaml
unet_pretrained_path: /path/to/first_stage_no_recurrent_checkpoint.bin
```

Then launch:
```bash
BASE_CKPT_DIR=./checkpoints \
MASTER_ADDR="localhost" \
MASTER_PORT="12356" \
WORLD_SIZE=4 \
CUDA_VISIBLE_DEVICES=4,5,6,7 \
python train_TMRec_multiGPU.py \
--config config/sr_stage_from_base.yaml \
--no_wandb
```

We thank the authors of Marigold for providing the original framework that inspired our modifications.
