AI Vision Agent using VideoSDK and Google Gemini Vision API

This project integrates VideoSDK, the OpenAI Realtime API, and the Gemini Vision API to analyse a screen-share stream in real time.

Start with the project

git clone https://github.com/videosdk-community/videosdk-gemini-vision-agent.git

cd videosdk-gemini-vision-agent

Client Setup

Navigate to the client directory:
```
cd client
```
Make a copy of the environment configuration file:
```
cp .env.example .env
```

Then set the following value in the .env file in the client folder:

VITE_VIDEOSDK_TOKEN=your_videosdk_auth_token_here

Obtain your VideoSDK Auth Token from app.videosdk.live
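
If the client fails to join a meeting, an expired token is a common cause. The sketch below assumes the VideoSDK auth token is a standard JWT carrying an `exp` claim; it decodes the payload without verifying the signature, so it is only a local sanity check. The function names `decode_jwt_payload` and `is_expired` are illustrative and not part of this project.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Return the payload of a JWT as a dict, without verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload_b64 = parts[1]
    # JWTs use base64url without padding; restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def is_expired(token: str, now: Optional[float] = None) -> bool:
    """True if the token has an `exp` claim that is not in the future."""
    exp = decode_jwt_payload(token).get("exp")
    if exp is None:
        # Assumption: a token without `exp` is treated as non-expiring.
        return False
    return exp <= (time.time() if now is None else now)
```

Calling `is_expired(token)` with the value you placed in `VITE_VIDEOSDK_TOKEN` tells you whether a fresh token is needed.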

Server Setup (Python FastAPI)

Create Virtual Environment (from project root):

python -m venv .venv

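Activate the virtual environment (for example, `source .venv/bin/activate` on macOS/Linux or `.venv\Scripts\activate` on Windows).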

Install Dependencies:

pip install -r requirements.txt

Create Server Environment File (in project root):

cp .env.example .env

Add these keys to your .env file:

OPENAI_API_KEY=your_openai_key_here
GEMINI_API_KEY=your_gemini_api_key
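
Before starting the server, it can help to confirm that both keys are present and no longer hold the placeholder values from .env.example. The following is a minimal sketch using only the standard library; the helper names (`parse_env`, `missing_keys`, `check_env_file`) are hypothetical and not part of this repository, and the parser handles only simple KEY=VALUE lines.

```python
from pathlib import Path

# Keys named in the server setup above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict, required=REQUIRED_KEYS) -> list:
    """Return required keys that are absent, empty, or still a 'your_...' placeholder."""
    return [k for k in required if not values.get(k) or values[k].startswith("your_")]


def check_env_file(path: str = ".env") -> list:
    """Read the env file at `path` and return the keys that still need values."""
    return missing_keys(parse_env(Path(path).read_text()))
```

Running `check_env_file()` from the project root returns an empty list once both keys are filled in.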

🔑 Obtaining API Keys

OpenAI: https://platform.openai.com/api-keys
Gemini: https://aistudio.google.com/apikey
VideoSDK Token: https://app.videosdk.live

▶️ Running the Application

Start the Server (From Project Root):

uvicorn app:app
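
To confirm the server came up before starting the client, a quick TCP check can be used. This sketch assumes uvicorn's defaults of host 127.0.0.1 and port 8000, which apply when no `--host` or `--port` flag is given; the function name `server_is_listening` is illustrative.

```python
import socket


def server_is_listening(host: str = "127.0.0.1", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

A `False` result only means nothing accepted the connection; it does not say why the server failed to start, so check the uvicorn output in that case.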

Start the Client (From /client Folder):

npm run dev

For more information, check out docs.videosdk.live.
