work-r-labs/workr_cosmos-transfer1

Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments.

Workr Commands

The commands below run Cosmos-Transfer1 inference on the Isaac RGB videos under workr_assets/isaac_videos/, each paired with a different controlnet spec from workr_assets/controlnet_specs/.

CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir checkpoints \
    --input_video_path workr_assets/isaac_videos/rgb_1.mp4 \
    --video_save_name output_video_test \
    --controlnet_specs workr_assets/controlnet_specs/workr_vis_test.json

CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir checkpoints \
    --input_video_path workr_assets/isaac_videos/rgb_1.mp4 \
    --video_save_name output_video_1 \
    --controlnet_specs workr_assets/controlnet_specs/workr_vis_1.json

CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir checkpoints \
    --input_video_path workr_assets/isaac_videos/rgb_2.mp4 \
    --video_save_name output_video_2 \
    --controlnet_specs workr_assets/controlnet_specs/workr_vis_2.json

CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir checkpoints \
    --input_video_path workr_assets/isaac_videos/rgb_5.mp4 \
    --video_save_name output_video_5 \
    --controlnet_specs workr_assets/controlnet_specs/workr_vis_5.json

CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir checkpoints \
    --input_video_path workr_assets/isaac_videos/rgb_1.mp4 \
    --video_save_name output_video_seg_1 \
    --controlnet_specs workr_assets/controlnet_specs/workr_seg_1.json

CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir checkpoints \
    --input_video_path workr_assets/isaac_videos/rgb_multiscene.mp4 \
    --video_save_name output_video_multiscene \
    --controlnet_specs workr_assets/controlnet_specs/workr_vis_multiscene.json

jq '.input_video_path="ignore/scene_0_rgb.mp4" | .seg.input_control="ignore/scene_0_instance_segmentation.mp4"' \
    /home/ubuntu/jasper-cosmos-transfer-1/cosmos-transfer1/workr_assets/controlnet_specs/workr_scene_template.json \
    > /home/ubuntu/jasper-cosmos-transfer-1/cosmos-transfer1/workr_assets/controlnet_specs/workr_scene_output.json

CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir checkpoints \
    --input_video_path ignore/scene_0_rgb.mp4 \
    --video_save_name scene_0_cosmos.mp4 \
    --controlnet_specs workr_assets/controlnet_specs/workr_scene_output.json

To Run On N Scene Videos

for i in {0..19}; do
    # Adjust the JSON file
    jq ".input_video_path=\"ignore/scene_${i}_rgb.mp4\" | .seg.input_control=\"ignore/scene_${i}_instance_segmentation.mp4\"" \
        /home/ubuntu/jasper-cosmos-transfer-1/cosmos-transfer1/workr_assets/controlnet_specs/workr_scene_template.json \
        > /home/ubuntu/jasper-cosmos-transfer-1/cosmos-transfer1/workr_assets/controlnet_specs/workr_scene_output.json

    # Run the Python script
    CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
        --checkpoint_dir checkpoints \
        --input_video_path "ignore/scene_${i}_rgb.mp4" \
        --video_save_name "scene_${i}_cosmos.mp4" \
        --controlnet_specs workr_assets/controlnet_specs/workr_scene_output.json
done
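
The loop above overwrites a single workr_scene_output.json on every iteration. If you want to keep each generated spec around (for inspection, or to re-run a single scene later), a variant that writes one spec per scene is sketched below; the workr_scene_${i}.json naming is made up for this example, and it assumes you run from the repository root so the relative template path resolves.

for i in {0..19}; do
    # Hypothetical per-scene spec name, kept instead of overwritten
    spec="workr_assets/controlnet_specs/workr_scene_${i}.json"
    jq ".input_video_path=\"ignore/scene_${i}_rgb.mp4\" | .seg.input_control=\"ignore/scene_${i}_instance_segmentation.mp4\"" \
        workr_assets/controlnet_specs/workr_scene_template.json > "$spec"

    CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
        --checkpoint_dir checkpoints \
        --input_video_path "ignore/scene_${i}_rgb.mp4" \
        --video_save_name "scene_${i}_cosmos.mp4" \
        --controlnet_specs "$spec"
done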

for i in {0..19}; do
    # Adjust the JSON file
    jq ".input_video_path=\"ignore/scene_${i}_rgb.mp4\"" \
        /home/ubuntu/jasper-cosmos-transfer-1/cosmos-transfer1/workr_assets/controlnet_specs/workr_scene_vis_template.json \
        > /home/ubuntu/jasper-cosmos-transfer-1/cosmos-transfer1/workr_assets/controlnet_specs/workr_scene_vis_output.json

    # Run the Python script
    CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
        --checkpoint_dir checkpoints \
        --input_video_path "ignore/scene_${i}_rgb.mp4" \
        --video_save_name "scene_${i}_vis_cosmos.mp4" \
        --controlnet_specs workr_assets/controlnet_specs/workr_scene_vis_output.json
done

RGB and Semantic Conditioning

CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir checkpoints \
    --input_video_path ignore/scene_0_rgb.mp4 \
    --video_save_name output_rgb_and_vis_scene_0 \
    --controlnet_specs workr_assets/controlnet_specs/workr_rgb_and_vis_scene_0.json

CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir checkpoints \
    --input_video_path ignore/scene_1_rgb.mp4 \
    --video_save_name output_rgb_and_vis_scene_1 \
    --controlnet_specs workr_assets/controlnet_specs/workr_rgb_and_vis_scene_1.json

CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir checkpoints \
    --input_video_path ignore/scene_2_rgb.mp4 \
    --video_save_name output_rgb_and_vis_scene_2 \
    --controlnet_specs workr_assets/controlnet_specs/workr_rgb_and_vis_scene_2.json

NVIDIA Cosmos

NVIDIA Cosmos is a developer-first world foundation model platform designed to help Physical AI developers build their Physical AI systems better and faster. Cosmos contains:

  1. Pre-trained models (available via Hugging Face) under the NVIDIA Open Model License that allows commercial use of the models for free.
  2. Training scripts under the Apache 2 License for post-training the models for various downstream Physical AI applications.

Key Features

Cosmos-Transfer1 is a pre-trained, diffusion-based conditional world model designed for multimodal, controllable world generation. It creates world simulations based on multiple spatial control inputs across various modalities, such as segmentation, depth, and edge maps. Cosmos-Transfer1 offers the flexibility to weight different conditional inputs differently at varying spatial locations and temporal instances, enabling highly customizable world generation. This capability is particularly useful for various world-to-world transfer applications, including Sim2Real.
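
To make the per-modality weighting concrete, here is a minimal sketch of a controlnet spec that combines a vis and a seg control with different weights. The input_video_path and seg.input_control fields mirror the ones edited by the jq commands earlier in this README; example_spec.json, the prompt, and the control_weight values are illustrative and follow the upstream Cosmos-Transfer1 sample specs rather than a confirmed schema for the workr_* files.

# Sketch only: write a combined vis + seg spec (hypothetical filename).
cat > workr_assets/controlnet_specs/example_spec.json << 'EOF'
{
    "prompt": "A photorealistic warehouse scene",
    "input_video_path": "ignore/scene_0_rgb.mp4",
    "vis": {
        "control_weight": 0.5
    },
    "seg": {
        "control_weight": 1.0,
        "input_control": "ignore/scene_0_instance_segmentation.mp4"
    }
}
EOF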

The model is available via Hugging Face. The post-training scripts will be released soon!
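
As a rough sketch of fetching the weights, assuming the Hugging Face model id nvidia/Cosmos-Transfer1-7B and a checkpoints/ layout (the repository's own download instructions are authoritative):

# Sketch only: pull the Cosmos-Transfer1-7B weights into checkpoints/.
# The target sub-directory and the full set of required checkpoints
# (tokenizer, text encoder, individual controlnets) are assumptions.
pip install -U "huggingface_hub[cli]"
huggingface-cli login   # may be required to accept the model license
huggingface-cli download nvidia/Cosmos-Transfer1-7B --local-dir checkpoints/Cosmos-Transfer1-7B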

Examples

The code snippet below shows the basic inference usage.

export CUDA_VISIBLE_DEVICES=0
export CHECKPOINT_DIR="${CHECKPOINT_DIR:=./checkpoints}"
CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir $CHECKPOINT_DIR \
    --video_save_folder outputs/robot_sample \
    --controlnet_specs assets/robot_sample_spec.json \
    --offload_text_encoder_model

Example input video: robot_sample_input.mp4. Example output video: robot_sample_output.mp4.

Model Family

| Model name | Description | Try it out | Supported Hardware |
|---|---|---|---|
| Cosmos-Transfer1-7B | World Generation with Adaptive Multimodal Control | Inference | 80GB H100 |
| Cosmos-Transfer1-7B-Sample-AV | Cosmos-Transfer1 for autonomous vehicle tasks | Inference | 80GB H100 |

License and Contact

This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.

NVIDIA Cosmos source code is released under the Apache 2 License.

NVIDIA Cosmos models are released under the NVIDIA Open Model License. For a custom license, please contact cosmos-license@nvidia.com.
