This project is an AI Voice Agent that uses VideoSDK for video conferencing, Deepgram for speech-to-text (STT), and OpenAI for language model (LLM) capabilities. The AI Copilot can join meetings, transcribe speech, and respond intelligently.
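To make the moving pieces concrete, here is a minimal sketch of the STT → LLM round trip, calling Deepgram's and OpenAI's public REST endpoints directly with `requests`. The endpoints and response shapes follow each vendor's published HTTP API, but this is only an illustration: the actual project likely uses streaming SDKs rather than batch calls, and the model name below is a placeholder.

```python
import os

import requests

DEEPGRAM_API_KEY = os.environ["DEEPGRAM_API_KEY"]
LLM_API_KEY = os.environ["LLM_API_KEY"]


def transcribe(wav_bytes: bytes) -> str:
    """Send recorded audio to Deepgram's pre-recorded /v1/listen endpoint."""
    resp = requests.post(
        "https://api.deepgram.com/v1/listen",
        headers={
            "Authorization": f"Token {DEEPGRAM_API_KEY}",
            "Content-Type": "audio/wav",
        },
        data=wav_bytes,
    )
    resp.raise_for_status()
    return resp.json()["results"]["channels"][0]["alternatives"][0]["transcript"]


def respond(transcript: str) -> str:
    """Ask OpenAI's chat completions API for a reply to the transcript."""
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {LLM_API_KEY}"},
        json={
            "model": "gpt-4o-mini",  # placeholder; use whichever model the project configures
            "messages": [{"role": "user", "content": transcript}],
        },
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```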
Clone the repository:

```bash
git clone https://github.com/videosdk-community/videosdk-deepgram-voice-agent
cd videosdk-deepgram-voice-agent
```
- Navigate to the `client` directory:

  ```bash
  cd client
  ```
- Make a copy of the environment configuration file:

  ```bash
  cp .env.example .env
  ```
- Set `VITE_APP_AUTH_TOKEN` in the `.env` file with your VideoSDK auth token from app.videosdk.live.
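  A correctly configured client `.env` then contains a line like the one below (the value shown is a placeholder for your actual token):

  ```
  VITE_APP_AUTH_TOKEN=your_videosdk_auth_token
  ```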
- Configure the following environment variables in the Python agent's `.env` file:

  ```bash
  ROOM_ID=...
  AUTH_TOKEN=...          # (app.videosdk.live)
  LANGUAGE=...
  DEEPGRAM_API_KEY=...    # (console.deepgram.com)
  LLM_API_KEY=...         # (platform.openai.com/api-keys)
  ```
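  The Python agent can pick these values up at startup. A minimal sketch using `python-dotenv` to load the file (an assumption about tooling; the project may load its configuration differently):

  ```python
  import os

  from dotenv import load_dotenv  # pip install python-dotenv

  load_dotenv()  # reads KEY=value pairs from .env into the process environment

  ROOM_ID = os.getenv("ROOM_ID")
  AUTH_TOKEN = os.getenv("AUTH_TOKEN")
  LANGUAGE = os.getenv("LANGUAGE", "en")  # default assumed here; check the project's docs
  DEEPGRAM_API_KEY = os.getenv("DEEPGRAM_API_KEY")
  LLM_API_KEY = os.getenv("LLM_API_KEY")
  ```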
- Create a virtual environment:

  ```bash
  python -m venv venv
  ```
- Activate the virtual environment:
  - On Unix or macOS:

    ```bash
    source venv/bin/activate
    ```

  - On Windows:

    ```bash
    .\venv\Scripts\activate
    ```
Generate a room ID on the client side and add it to the Python configuration: run the client application, copy the room ID it generates, and set it as `ROOM_ID` in the `.env` file for the Python setup.
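Alternatively, you can mint a room ID without running the client by calling VideoSDK's room-creation endpoint. The sketch below follows VideoSDK's public REST API (`POST /v2/rooms` authorized with your auth token); verify the endpoint and response shape against docs.videosdk.live before relying on it:

```python
import os

import requests

# Create a VideoSDK room and print its id for use as ROOM_ID in the Python .env.
resp = requests.post(
    "https://api.videosdk.live/v2/rooms",
    headers={"Authorization": os.environ["AUTH_TOKEN"]},
)
resp.raise_for_status()
print(resp.json()["roomId"])  # paste this value into ROOM_ID in .env
```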
For more information, check out docs.videosdk.live.