
A lightweight text-to-video generator using Hugging Face’s Zeroscope v2 model. Optimized for CPU-only environments, with basic prompt enhancement, caching, and fallback placeholder support for reliable generation. Ideal for short, fast, cinematic videos from text prompts.


VyomThaker-2154/VideoGen


title: AI Text-to-Video Ultra-Fast Generator
emoji: 🎥
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.43.1
python_version: 3.10
app_file: app.py
pinned: false
license: mit

🎥 AI Text-to-Video Ultra-Fast Generator

This app generates short AI videos (2–5 seconds) from a text prompt using the open-source Zeroscope v2 576w model. It is optimized for CPU-only Hugging Face Spaces, with lightweight caching and prompt improvement techniques for faster and more consistent results.


🚀 How to Use

  1. Enter your prompt in the textbox.
  2. Click Generate Video.
  3. Wait a few seconds to a couple of minutes (CPU speed dependent).
  4. Watch your generated AI video in the player below.

⚠️ On free CPU Spaces, video generation may take longer. Using a GPU significantly speeds up the process.


🛠️ Features

  • CPU-optimized video generation with fewer frames and inference steps.
  • Lightweight in-memory cache for the last 5 prompts.
  • Basic prompt improvement for cleaner, more consistent output.
  • Low frame rate (6 FPS) for CPU efficiency.
  • Uses Zeroscope v2 576w diffusion pipeline from Hugging Face.
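
As a rough illustration of how these features could fit together, here is a minimal sketch of a CPU-oriented generation call with the diffusers library. It is not the code in app.py. The Hub id `cerspense/zeroscope_v2_576w`, the step count, the resolution, and the names `CpuSettings`, `pipeline_kwargs`, and `generate` are all assumptions made for the example; only the frame rate of 6 and the six-frame limit come from this README.

```python
from dataclasses import dataclass, asdict

# Assumed Hub id for the Zeroscope v2 576w checkpoint.
MODEL_ID = "cerspense/zeroscope_v2_576w"


@dataclass(frozen=True)
class CpuSettings:
    """Hypothetical CPU-friendly defaults for this sketch."""
    num_frames: int = 6            # README tip: keep num_frames <= 6 on CPU
    num_inference_steps: int = 10  # assumed value; fewer steps run faster
    height: int = 320              # assumed output size (576x320)
    width: int = 576
    fps: int = 6                   # frame rate listed under Features


def pipeline_kwargs(prompt: str, settings: CpuSettings = CpuSettings()) -> dict:
    """Build the keyword arguments passed to the diffusion pipeline."""
    kwargs = asdict(settings)
    kwargs.pop("fps")  # fps is used when writing the file, not by the pipeline
    kwargs["prompt"] = prompt
    return kwargs


def generate(prompt: str, out_path: str = "output.mp4",
             settings: CpuSettings = CpuSettings()) -> str:
    """Run the pipeline on CPU and write the frames to a video file."""
    # Heavy imports are deferred so the helpers above work without them.
    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import export_to_video

    pipe = DiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float32)
    pipe.to("cpu")
    result = pipe(**pipeline_kwargs(prompt, settings))
    frames = result.frames[0]  # first (and only) video in the batch
    return export_to_video(frames, out_path, fps=settings.fps)
```

Calling `generate("a red fox running through snow")` would download the model on first use and return the path of the written file. Writing the video may also need a backend such as imageio-ffmpeg or OpenCV, which the dependency list in this README does not include.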

⚡ Setup Instructions

1. Clone the repository (optional, if running locally)

git clone <your-repo-url>
cd <your-repo-folder>

2. Install dependencies

pip install --upgrade pip
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install diffusers==0.33.0 transformers accelerate safetensors gradio==5.43.1

3. Run the app locally (optional)

python app.py

4. Deploy to Hugging Face Spaces

Upload app.py.

Ensure your requirements.txt contains:

torch
diffusers==0.33.0
transformers
accelerate
safetensors
gradio==5.43.1

No API keys are required.


💡 Tips for Faster Results on CPU

  • Keep prompts short and descriptive.
  • Keep num_frames ≤ 6 for faster CPU generation.
  • Repeated prompts are served from the cache, so they return faster than a fresh generation.
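
Frame count and frame rate together determine clip length, so it can help to work out the trade-off before changing num_frames. The sketch below does that arithmetic for the 6 FPS setting listed under Features. The helper names `clip_seconds` and `frames_for` are hypothetical and are not taken from app.py.

```python
import math

FPS = 6  # frame rate listed under Features


def clip_seconds(num_frames: int, fps: int = FPS) -> float:
    """Length in seconds of a clip with the given number of frames."""
    return num_frames / fps


def frames_for(seconds: float, fps: int = FPS) -> int:
    """Smallest frame count that covers the requested duration."""
    return math.ceil(seconds * fps)


print(clip_seconds(6))  # 1.0
print(frames_for(2))    # 12
print(frames_for(5))    # 30
```

At 6 FPS, six frames give a clip of about one second, while a 2 to 5 second clip needs 12 to 30 frames and correspondingly more CPU time.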

⚠️ Notes

  • The first generation after startup may take longer because the model has to be loaded.
  • Free CPU Spaces are slow; expect 1–2 minutes for small videos.
  • In-memory caching does not persist across restarts.
  • This app does not require API keys or GPU to run.
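
For readers curious about the caching behavior noted above, here is a minimal sketch of an in-memory cache that keeps results for the last five prompts. The class name `PromptCache`, the prompt normalization, and the least-recently-used eviction order are assumptions for illustration; the app's actual cache may be implemented differently.

```python
from collections import OrderedDict
from typing import Optional


class PromptCache:
    """Hypothetical in-memory cache mapping prompts to video file paths."""

    def __init__(self, max_items: int = 5):
        self.max_items = max_items
        self._items: "OrderedDict[str, str]" = OrderedDict()

    @staticmethod
    def _key(prompt: str) -> str:
        # Assumed normalization so trivially different prompts share an entry.
        return " ".join(prompt.lower().split())

    def get(self, prompt: str) -> Optional[str]:
        key = self._key(prompt)
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, prompt: str, video_path: str) -> None:
        key = self._key(prompt)
        self._items[key] = video_path
        self._items.move_to_end(key)
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)  # drop the least recently used


cache = PromptCache()
cache.put("A cat on a beach", "cat.mp4")
print(cache.get("a cat  on a beach"))  # cat.mp4
```

Because the entries live in a plain Python object, they disappear whenever the process or Space restarts.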
