DepthFlow Modal Integration Overview
This repository provides a minimal Python interface for running DepthFlow, the soul of this project, on Modal's serverless GPU infrastructure.
Think of this as bringing the magic of Immersity AI to the open-source world: powered by DepthFlow, crafted by BrokenSource, and deployed with a script anyone can run.
All credit for the core functionality goes to DepthFlow, a remarkable open-source tool for image-to-video transformation using motion and depth inference. This repo merely wraps it in a Modal deployment for ease of use and scaling. Make sure to check out the DepthFlow repo and the Brokensrc website.

Features
- Batch Processing: Convert multiple images into videos using DepthFlow with GPU acceleration.
- Web Interface: Gradio-powered GUI for easy access and real-time previews.
- Serverless Scaling: Run on Modal's on-demand infrastructure with parallel processing.
- Logging: Track processed files and errors via structured logs.

Requirements
- Python 3.12
- Modal account with CLI installed
- NVIDIA GPU (T4 recommended for now)

Scripts
- depthflow_bulk.py
- Batch-converts PNG images in /data/images to MP4 videos using DepthFlow.
Usage
- Place your PNG images in the /data/images directory.
- Run the script:
  modal run depthflow_bulk.py
- Processed videos will be saved in /data/videos.
Highlights
- Automatically skips already processed images.
- Logs successes and errors to /data/logs.
- Customizable hardware allocation (CPU, GPU, memory).
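
The skip-and-log behaviour listed in the highlights can be pictured with a small standard-library sketch. It is an illustration under assumptions, not the code in depthflow_bulk.py: the names `pending_images` and `run_batch`, the `batch.log` file name, and the `convert` callback (standing in for the actual DepthFlow render call) are all hypothetical.

```python
from pathlib import Path
from typing import Callable


def pending_images(image_dir: Path, video_dir: Path) -> list[tuple[Path, Path]]:
    """Pair each PNG with its target MP4, skipping PNGs whose MP4 already exists."""
    jobs = []
    for png in sorted(image_dir.glob("*.png")):
        mp4 = video_dir / f"{png.stem}.mp4"
        if not mp4.exists():
            jobs.append((png, mp4))
    return jobs


def run_batch(
    image_dir: Path,
    video_dir: Path,
    log_dir: Path,
    convert: Callable[[Path, Path], None],
) -> dict[str, int]:
    """Convert every pending image and append one log line per file.

    `convert` is a hypothetical stand-in for the DepthFlow render step.
    """
    video_dir.mkdir(parents=True, exist_ok=True)
    log_dir.mkdir(parents=True, exist_ok=True)
    counts = {"ok": 0, "failed": 0}
    # Assumed log file name; the real script may organise its logs differently.
    with (log_dir / "batch.log").open("a", encoding="utf-8") as log:
        for png, mp4 in pending_images(image_dir, video_dir):
            try:
                convert(png, mp4)
            except Exception as exc:  # keep going so one bad image does not stop the batch
                counts["failed"] += 1
                log.write(f"ERROR {png.name}: {exc}\n")
            else:
                counts["ok"] += 1
                log.write(f"OK {png.name} -> {mp4.name}\n")
    return counts
```

Because the check is based on whether the output file exists, an image whose conversion failed leaves no MP4 behind and is picked up again on the next run.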
- depthflow_gui.py
- Launches a Gradio web interface for DepthFlow.
Usage
- Run the script:
  modal serve depthflow_gui.py
Highlights
- Real-time image-to-video interface.
- Supports concurrent users and container scaling.
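
For readers unfamiliar with serving Gradio from Modal, one common pattern is to mount a Gradio app on a FastAPI app and expose it through `modal.asgi_app()`. The sketch below shows that pattern only; it is not the contents of depthflow_gui.py. The app name, the `render` function, and the image contents are assumptions, and the DepthFlow call itself is deliberately left unimplemented.

```python
import modal

# Hypothetical app name and a minimal image; the real script's image differs.
app = modal.App("depthflow-gui-sketch")
image = modal.Image.debian_slim(python_version="3.12").pip_install("gradio", "fastapi")


@app.function(image=image, gpu="T4")
@modal.asgi_app()
def ui():
    # Imported inside the function so they are only needed in the container.
    import gradio as gr
    from fastapi import FastAPI

    def render(image_path: str) -> str:
        # Placeholder: the real interface would call DepthFlow here and
        # return the path of the rendered MP4.
        raise gr.Error("DepthFlow rendering is not wired up in this sketch")

    demo = gr.Interface(
        fn=render,
        inputs=gr.Image(type="filepath"),  # uploaded image is passed as a file path
        outputs=gr.Video(),                # expects a path to a video file
    )
    # Mount the Gradio app at the root of a FastAPI app and hand it to Modal.
    return gr.mount_gradio_app(app=FastAPI(), blocks=demo, path="/")
```

Saved as a file and started with `modal serve`, a function like this would get a temporary URL while the command is running.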

Modal Configuration
Both scripts use a pre-built Modal container with the following:
- depthflow==0.9.0.dev1
- torch==2.6.0 (CUDA 12.4)
- Tools: wget, git, ffmpeg
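
A container with these contents could be declared roughly as follows. This is a sketch under assumptions rather than the repository's actual definition: the app name, the volume name, the `convert` function, the resource numbers, and the idea that /data is backed by a Modal Volume are all hypothetical, and `torch==2.6.0` is assumed to resolve to a CUDA 12.4 build.

```python
import modal

# Hypothetical names, chosen for illustration only.
app = modal.App("depthflow-sketch")
data_volume = modal.Volume.from_name("depthflow-data", create_if_missing=True)

# Image with the system tools and pinned Python packages listed above.
image = (
    modal.Image.debian_slim(python_version="3.12")
    .apt_install("wget", "git", "ffmpeg")
    .pip_install("depthflow==0.9.0.dev1", "torch==2.6.0")
)


@app.function(
    image=image,
    gpu="T4",                         # GPU type recommended in the requirements
    cpu=4,                            # assumed CPU core request
    memory=16384,                     # assumed memory request, in MiB
    timeout=3600,                     # assumed per-call timeout, in seconds
    volumes={"/data": data_volume},   # assumption: /data lives on a Modal Volume
)
def convert(image_name: str) -> str:
    # Placeholder body: the real scripts invoke DepthFlow here.
    return f"/data/videos/{image_name.rsplit('.', 1)[0]}.mp4"


@app.local_entrypoint()
def main():
    # Runs locally under `modal run` and calls the function remotely.
    print(convert.remote("example.png"))
```

Adjusting `gpu`, `cpu`, and `memory` in the decorator is how hardware allocation would be customized in a setup like this.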

Contributing
Contributions are welcome! Feel free to open an issue or submit a pull request for improvements or fixes.

License
This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments
DepthFlow: the soul of this project. Without it, there is no magic. Like Immersity AI, but open-source and written by BrokenSource.
Modal: for enabling seamless, serverless GPU computing.