Official PyTorch implementation for "FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis".

ML-GSAI/FlexWorld

FlexWorld

This is the official PyTorch implementation of FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis.

Update

  • [2025-5-21]: Added training code and data preparation.

Installation

For complete installation instructions, please see INSTALL.md.

Usage

Static scene video generation given an image and a camera trajectory:

python video_generate.py --input_image_path ./assets/room.png --output_dir ./results-single-traj

You can pass the --traj argument to specify a camera movement. The basic movements are defined in "ops/utils/all_traj.py". The supported camera movements are ["up","down","left","right","forward","backward","rotate_left","rotate_right"].

python video_generate.py --input_image_path ./assets/room.png --output_dir ./results-single-traj --traj backward
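
To render one video per basic movement, the command above can be scripted. The following is a minimal sketch and not part of the repository: it only assembles command lines from the movement names listed above and the flags shown in the examples. The helper name `build_commands` and the one-subdirectory-per-movement output layout are assumptions made for illustration.

```python
import shlex

# Movement names as listed in this README.
MOVEMENTS = [
    "up", "down", "left", "right",
    "forward", "backward", "rotate_left", "rotate_right",
]


def build_commands(image_path, output_root, movements=MOVEMENTS):
    """Return one video_generate.py argument list per camera movement."""
    commands = []
    for traj in movements:
        commands.append([
            "python", "video_generate.py",
            "--input_image_path", image_path,
            # Hypothetical layout: one output subdirectory per movement.
            "--output_dir", f"{output_root}/{traj}",
            "--traj", traj,
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in build_commands("./assets/room.png", "./results-single-traj"):
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argument lists can be passed to `subprocess.run` once the paths have been checked.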

You can also generate videos that share camera trajectories with videos in DL3DV and Re10K by passing the video path to the --traj argument.

python video_generate.py --input_image_path ./assets/room.png --output_dir ./results-single-traj --traj ./path_to_dl3dv/1.mp4

Flexible-view 360° scene generation given an image:

# You are free to modify the corresponding YAML configuration file by name in `./configs/examples`.
python main_3dgs.py --name room2
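
To see which names are available, the configuration directory can be listed. This is a small sketch under an assumption: that each example is stored as `<name>.yaml` (or `.yml`) directly inside `./configs/examples`, which the comment above suggests but does not state. The helper `list_example_names` is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_example_names(config_dir="./configs/examples"):
    """Return the sorted file stems of the YAML files found in config_dir."""
    root = Path(config_dir)
    if not root.is_dir():
        # Nothing to list if the directory is missing.
        return []
    return sorted(
        p.stem
        for p in root.iterdir()
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )


if __name__ == "__main__":
    for name in list_example_names():
        # Assumed mapping: the file stem is the value passed to --name.
        print(f"python main_3dgs.py --name {name}")
```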

Visualization

First, run:

python 3dgs_viewer.py

Then visit 127.0.0.1:8000 to freely explore the generated scenes in the current directory. The script scans for ply files recursively, so run it after generation has finished.
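
Before starting the viewer, it can be useful to check which ply files a recursive scan of the current directory would find. The sketch below illustrates such a scan with the standard library. It is not the viewer's actual code, and the helper name `find_ply_files` is an assumption.

```python
from pathlib import Path


def find_ply_files(root="."):
    """Recursively collect all .ply files under root, sorted by path."""
    return sorted(Path(root).rglob("*.ply"))


if __name__ == "__main__":
    files = find_ply_files(".")
    if not files:
        print("No .ply files found; run the generation step first.")
    for path in files:
        print(path)
```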

Dataset Preparation

  1. Download the dataset to a local directory following the DL3DV repo. You may download only part of it, such as the 1K subset.

  2. Prepare 3DGS reconstructions from the DL3DV dataset: first download the DL3DV colmap annotation, then run reconstruction following the Gaussian Splatting repo. The final output should be laid out like this:

- output/
  - 001dccbc1f78146a9f03861026613d8e73f39f372b545b26118e37a23c740d5f
    - point_cloud
      - iteration_7000
        - point_cloud.ply
  - 0003dc82473fd52c53dcbdc2deb4e6e9c3548d6f8c9b03f9ea8d3c7d3ce33546
    - point_cloud
      - iteration_7000
        - point_cloud.ply
  3. Run the following to generate the broken videos rendered from the 3DGS reconstructions.
# The paths here are examples.
python gen_dataset.py --dataset_path ./DL3DV/DL3DV-10K/1K --output_path ./DL3DV/processed --gs_path ./gaussian-splatting/output 
  4. Run the following to label the generated videos.
# The paths here are examples.
python label_dataset.py --input_path ./DL3DV/processed --output_path ./train_data_v2v
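
Since the scripts above read the 3DGS output from --gs_path, it may help to confirm that every scene directory follows the layout shown in step 2 before running them. The following is a minimal sketch, assuming the `point_cloud/iteration_7000/point_cloud.ply` structure listed above. The helper `missing_point_clouds` is hypothetical and not part of the repository.

```python
from pathlib import Path


def missing_point_clouds(gs_path, iteration=7000):
    """Return names of scene directories that lack
    point_cloud/iteration_<iteration>/point_cloud.ply."""
    root = Path(gs_path)
    missing = []
    for scene in sorted(p for p in root.iterdir() if p.is_dir()):
        ply = scene / "point_cloud" / f"iteration_{iteration}" / "point_cloud.ply"
        if not ply.is_file():
            missing.append(scene.name)
    return missing


if __name__ == "__main__":
    gs_root = Path("./gaussian-splatting/output")  # example path from this README
    if gs_root.is_dir():
        for name in missing_point_clouds(gs_root):
            print(f"missing point cloud: {name}")
    else:
        print(f"{gs_root} does not exist")
```

An empty result means every scene under the given directory has a point cloud at the expected iteration.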

Training

  1. Change the following lines in "./tools/CogVideo/configs/sft_v2v.yaml".
args:
  checkpoint_activations: True 
  experiment_name: lora-disney # your save folder name 
  mode: finetune
  load: "xxx/CogVideoX-5B-I2V-SAT/transformer" # path to transformer original checkpoints
  save: "./ckpts_5b" # path to save dir.
  train_data: [ "train_data_v2v" ] # Train data path
  valid_data: [ "train_data_v2v" ] # Validation data path, can be the same as train_data (not recommended)
  2. Run the training script.
cd ./tools/CogVideo/
bash train_video_v2v.py

ToDo List

  • A user manual for our camera trajectory, offering support for more flexible trajectory inputs and accommodating a wider variety of trajectory types (such as RealEstate camera input and DL3DV-10K camera input).
  • A 3DGS viewer for generated results.
  • Training code for video diffusion model.

Acknowledgement

This work is built on many amazing open-source projects. Thanks to all the authors!

Citation

@misc{chen2025flexworldprogressivelyexpanding3d,
      title={FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis}, 
      author={Luxi Chen and Zihan Zhou and Min Zhao and Yikai Wang and Ge Zhang and Wenhao Huang and Hao Sun and Ji-Rong Wen and Chongxuan Li},
      year={2025},
      eprint={2503.13265},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.13265}, 
}
