This UI provides a debug console for real-time AI voice interactions. It works with multiple realtime models (see the supported-models list below). Features include:
- Cost Tracking: Know how much you've spent per voice interaction
- Model Support: works with open-source (Orpheus 3B) and closed-source S2S models like OpenAI's GPT-4o Realtime (more coming soon!)
- Voice and Chat UI
- Session history and recording
Inspired by openai-realtime-console and openai-realtime-agents.
- Get your API keys (you'll need an OpenAI key and an Outspeed key).
- Set up the environment:

  ```bash
  cp .env.example .env
  # Add your API keys to .env:
  # OPENAI_API_KEY="<your-openai-key>"
  # OUTSPEED_API_KEY="<your-outspeed-key>"
  ```

- Install and run:

  ```bash
  npm install
  npm run dev
  ```
Visit http://localhost:3000 to access the console.
To modify the agent prompt and tools, edit src/agent-config.ts.
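For orientation, an agent entry in that file might look roughly like the sketch below. This is a minimal sketch, not the repo's actual API: the export name, the instructions/tools fields, and the JSON-schema tool shape are all assumptions to illustrate the idea.

```ts
// Hypothetical agent entry -- field names and shapes are illustrative
// assumptions, not the file's actual schema.
export const dentalAgent = {
  name: "Dental Agent",
  // System prompt that steers the voice model's behavior.
  instructions:
    "You answer callers' questions about a dental clinic's working hours. " +
    "Be brief and friendly. The clinic is open 9am-5pm, Monday to Friday.",
  // Tools the model may call mid-conversation (JSON-schema parameters).
  tools: [
    {
      name: "get_working_hours",
      description: "Return the clinic's working hours for a given day.",
      parameters: {
        type: "object",
        properties: {
          day: { type: "string", description: "Day of the week, e.g. 'Monday'" },
        },
        required: ["day"],
      },
    },
  ],
};
```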
To modify model parameters (voice, version, etc.), edit model-config.js.
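Similarly, the model settings might be grouped like the following sketch. The key names (model, voice, temperature) and the example values are assumptions; check the actual file for the real options.

```ts
// Hypothetical model settings -- keys and values are illustrative assumptions.
export const modelConfig = {
  model: "gpt-4o-realtime-preview", // which realtime model the session uses
  voice: "alloy",                   // voice preset for speech output
  temperature: 0.8,                 // sampling temperature for responses
};
```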
Three example voice agents are included; you can select one in the Session Config UI on the right before starting a session:
- Dental Agent: Answers callers' questions about working hours of a dental clinic
- Message Agent: Takes callers' messages for a person
- Recruiter Agent: Talks to a candidate and asks questions about their background and availability
You can see their prompts in the ./src/agent-config.ts file.
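As a concrete illustration of how one of these agents could handle a tool call, here is a sketch of a handler the Message Agent might use. The takeMessage name, the argument shape, and the return-a-confirmation-string convention are all hypothetical; the real wiring lives in ./src/agent-config.ts.

```ts
// Hypothetical tool handler for the Message Agent -- names and shapes
// are illustrative assumptions.
type Message = { caller: string; forWhom: string; body: string };

const messages: Message[] = [];

// Invoked when the model calls the tool with parsed arguments; the
// returned string goes back to the model so it can confirm aloud.
function takeMessage(args: Message): string {
  messages.push(args);
  return `Got it. I'll pass your message along to ${args.forWhom}.`;
}
```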
Supported models:
- Orpheus 3B (hosted by Outspeed)
- OpenAI Realtime models
- Moshi (Coming Soon)
- Gemini Multimodal Live (Coming Soon)
You can deploy your agent to Cloudflare by following the steps at demo.outspeed.com/deploy, or you can run this locally and visit the deploy route.
License: MIT