Veo 3 is Google's state-of-the-art video generation model available in the Gemini API. This repository is a quickstart that demonstrates how to build a simple UI to generate videos with Veo 3, play them, and download the results. It also supports image + text to video generation, where the starting image is generated with the Imagen 4 model.
Note
If you want a full studio, consider Google's Flow (a professional environment for Veo/Imagen). Use this repo as a lightweight quickstart to learn how to build your own UI that generates videos with Veo 3 via the Gemini API.
(This is not an official Google product.)
- Generate videos from text prompts using the Veo 3 model.
- Generate videos from an image plus a text prompt. The starting image can be generated with the Imagen 4 model or uploaded.
- Play and download generated videos.
- Cut videos directly in the browser to a specific time range.
Follow these steps to get the application running locally for development and testing.
1. Prerequisites:
- Node.js and npm (or yarn/pnpm)
- GEMINI_API_KEY: The application requires a Gemini API key. Either create a .env file in the project root and add your API key (GEMINI_API_KEY="YOUR_API_KEY"), or set the environment variable in your system.
Warning
Google Veo 3 and Imagen 4 are both part of the Gemini API Paid tier. You will need to be on the paid tier to use these models.
2. Install Dependencies:
npm install
3. Run Development Server:
npm run dev
Open your browser and navigate to http://localhost:3000 to see the application.
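For the API key prerequisite in step 1, a small guard can make a missing or placeholder key fail early with a readable message. The following is a minimal sketch, not code from this repository: requireGeminiApiKey is a hypothetical helper name, and it assumes the key reaches server-side code as an environment variable (Next.js loads .env files into process.env on the server).

```typescript
// Hypothetical helper (not part of this repository): read GEMINI_API_KEY from
// an environment-like object and fail early if it is missing or still set to
// the placeholder value from the setup instructions.
type Env = Record<string, string | undefined>;

function requireGeminiApiKey(env: Env): string {
  const key = env["GEMINI_API_KEY"]?.trim();
  if (!key || key === "YOUR_API_KEY") {
    throw new Error(
      "GEMINI_API_KEY is not set. Add it to .env in the project root " +
        "or set it as an environment variable.",
    );
  }
  return key;
}

// Possible usage in server-side code, for example inside an API route:
// const apiKey = requireGeminiApiKey(process.env);
```

Taking the environment as a parameter, rather than reading process.env inside the function, keeps the helper easy to test.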
The project is a standard Next.js application with the following key directories:
- app/: Contains the main application logic, including the user interface and API routes.
  - api/: API routes for generating videos and images, and checking operation status.
- components/: Reusable React components used throughout the application.
- lib/: Utility functions and schema definitions.
- public/: Static assets.
- Gemini API docs: https://ai.google.dev/gemini-api/docs
- Veo 3 Guide: https://ai.google.dev/gemini-api/docs/video?example=dialogue
- Imagen 4 Guide: https://ai.google.dev/gemini-api/docs/imagen
The application uses the following API routes to interact with the Google models:
- app/api/veo/generate/route.ts: Handles video generation requests. It takes a text prompt as input and initiates a video generation operation with the Veo 3 model.
- app/api/veo/operation/route.ts: Checks the status of a video generation operation.
- app/api/veo/download/route.ts: Downloads the generated video.
- app/api/imagen/generate/route.ts: Handles image generation requests with the Imagen model.
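The routes above suggest a three-step client flow: start a generation, poll the operation until it finishes, then download the result. The sketch below illustrates the first two steps under stated assumptions. The URL paths follow from the route file locations, but the request bodies ({ prompt }, { name }), the response fields (name, done), the use of POST for both routes, and the helper names postJson and generateAndWait are all hypothetical; check the route handlers for the actual contract.

```typescript
// Sketch only: request and response shapes are assumptions, not the
// repository's actual contract.
type JsonResponse = { ok: boolean; status: number; json(): Promise<unknown> };
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<JsonResponse>;

// Assumed shape of an operation as returned by the routes.
interface VeoOperation {
  name: string; // assumed: identifier of the long-running operation
  done?: boolean; // assumed: true once the video is ready
}

// POST a JSON payload and parse the JSON response.
async function postJson(
  fetchFn: FetchLike,
  url: string,
  payload: unknown,
): Promise<VeoOperation> {
  const res = await fetchFn(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`${url} failed with status ${res.status}`);
  return (await res.json()) as VeoOperation;
}

async function generateAndWait(
  prompt: string,
  fetchFn: FetchLike,
  intervalMs = 5000,
  maxPolls = 120,
): Promise<VeoOperation> {
  // Step 1: start the video generation operation.
  let op = await postJson(fetchFn, "/api/veo/generate", { prompt });

  // Step 2: poll the operation route until it reports completion,
  // waiting intervalMs before each poll and giving up after maxPolls.
  for (let i = 0; i < maxPolls && !op.done; i++) {
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    op = await postJson(fetchFn, "/api/veo/operation", { name: op.name });
  }
  if (!op.done) throw new Error("Timed out waiting for the video");

  // Step 3 (not shown): request the finished video from /api/veo/download.
  return op;
}
```

In a browser this could be called as generateAndWait("a prompt", fetch). Passing the fetch function as a parameter is a design choice for this sketch so the flow can be exercised with a stub instead of a live server.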
- Next.js - React framework for building the user interface.
- React - JavaScript library for building user interfaces.
- Tailwind CSS - For styling.
- Gemini API - Veo 3 for video generation and Imagen for image generation.
- Want a feature? Please open an issue describing the use case and proposed behavior.
This project is licensed under the Apache License 2.0.