A real-time voice AI agent built with Groq API that enables natural voice conversations with configurable AI models, voices, and system prompts.
This application demonstrates real-time voice interaction using the Groq API for fast speech-to-text, AI inference, and text-to-speech. It is built as a complete, end-to-end template that you can fork, customize, and deploy.
Key Features:
- Real-time Voice Conversations: Talk naturally with an AI agent using voice activity detection (VAD)
- Visual Flow Diagram: Interactive node-based visualization showing the voice processing pipeline in real-time
- Configurable AI Models: Choose from multiple Groq models including Llama 4 Maverick and Scout variants
- Multiple TTS Voices: Select from 19 different PlayAI voices for personalized responses
- Custom System Prompts: Easily customize the AI's personality and behavior
- Microphone Selection: Support for multiple audio input devices
- Conversation History: Maintains context across multi-turn conversations
- Sub-second response times, efficient concurrent request handling, and production-grade performance powered by Groq
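As an illustration of how a custom system prompt and conversation history might be combined on each turn, here is a minimal sketch. The `ChatMessage` type, the `buildMessages` helper, and the `maxMessages` cap are hypothetical names introduced for this example; they are not taken from the template's source.

```typescript
// Hypothetical types and helper for illustration; not the template's actual code.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

/**
 * Builds the message list for the next LLM request: the system prompt first,
 * then at most `maxMessages` of the most recent history entries, then the
 * newly transcribed user text.
 */
function buildMessages(
  systemPrompt: string,
  history: ChatMessage[],
  userText: string,
  maxMessages = 20,
): ChatMessage[] {
  // slice(-0) would return the whole array, so handle a zero cap explicitly.
  const recent = maxMessages > 0 ? history.slice(-maxMessages) : [];
  return [
    { role: "system", content: systemPrompt },
    ...recent,
    { role: "user", content: userText },
  ];
}
```

Capping the history keeps each request bounded in size while still preserving recent context; the right cap depends on the model's context window and how long your conversations run.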
Tech Stack:
- Frontend: Svelte 5, TypeScript, Tailwind CSS, Vite
- UI Components: Shadcn/ui components for Svelte
- Voice Processing: Voice Activity Detection (VAD) with @ricky0123/vad-web
- Audio Handling: Custom TTS audio buffer with streaming support
- Flow Visualization: @xyflow/svelte for interactive node diagrams
- AI Infrastructure: Groq API (Speech-to-Text, LLM, Text-to-Speech)
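A VAD library typically hands back a detected speech segment as raw audio samples, which need to be packaged as a file before they can be uploaded for transcription. The sketch below encodes mono `Float32Array` samples as a 16-bit PCM WAV file. It assumes 16 kHz mono input; `encodeWav` is a hypothetical helper written for this example, and the VAD library may ship its own equivalent.

```typescript
/**
 * Encodes mono Float32 samples in [-1, 1] as a 16-bit PCM WAV file.
 * Assumption: the input is mono audio at `sampleRate` Hz (16 kHz by default).
 */
function encodeWav(samples: Float32Array, sampleRate = 16000): ArrayBuffer {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  // RIFF header
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");

  // "fmt " chunk describing uncompressed mono PCM
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size
  view.setUint16(20, 1, true); // audio format: 1 = PCM
  view.setUint16(22, 1, true); // channels: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample

  // "data" chunk with the samples scaled to signed 16-bit integers
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i] ?? 0));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

In the browser, the result can be wrapped as `new Blob([encodeWav(samples)], { type: "audio/wav" })` and attached to a multipart upload.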
Voice Processing Pipeline:
- Microphone Input → Voice Activity Detection (VAD)
- Speech Recording → Groq Whisper (Speech-to-Text)
- Text Processing → Groq LLM (AI Inference)
- Response Generation → Groq PlayAI (Text-to-Speech)
- Audio Output → Streaming audio playback
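The LLM and text-to-speech stages of the pipeline above can be sketched as plain HTTP requests. This is a rough illustration, not the template's implementation (which may use an SDK instead). The base URL, endpoint paths, model IDs, voice name, and `response_format` value are assumptions based on Groq's OpenAI-compatible API and should be checked against the current Groq documentation. `jsonRequest`, `chatRequest`, and `speechRequest` are hypothetical helpers.

```typescript
// Sketch only: the URL, paths, model IDs, and voice name below are assumptions.
const GROQ_BASE_URL = "https://api.groq.com/openai/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface JsonRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

/** Builds an authenticated JSON POST request description for a given path. */
function jsonRequest(apiKey: string, path: string, payload: unknown): JsonRequest {
  return {
    url: `${GROQ_BASE_URL}${path}`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}

/** LLM stage: transcribed text (plus history) in, assistant reply out. */
function chatRequest(
  apiKey: string,
  messages: ChatMessage[],
  model = "meta-llama/llama-4-scout-17b-16e-instruct", // assumed model ID
): JsonRequest {
  return jsonRequest(apiKey, "/chat/completions", { model, messages });
}

/** TTS stage: assistant reply text in, audio bytes out. */
function speechRequest(
  apiKey: string,
  text: string,
  voice = "Fritz-PlayAI", // assumed voice name
): JsonRequest {
  return jsonRequest(apiKey, "/audio/speech", {
    model: "playai-tts", // assumed model ID
    voice,
    input: text,
    response_format: "wav", // assumed parameter value
  });
}
```

Each description can be sent with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. The speech-to-text stage differs in that it is usually a multipart upload (an audio `file` field plus a `model` field) rather than a JSON body, so it is left out of this sketch.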
Prerequisites:
- Node.js 18+ and npm
- Groq API key (create a free GroqCloud account and generate an API key in the Groq Console)
Setup:
- Clone the repository:
    git clone https://github.com/benank/groq-voice-agent-template
    cd groq-voice-agent-template
- Install dependencies:
    npm install
- Start the development server:
    npm run dev
- Open your browser: Navigate to http://localhost:5173 and start talking to your AI agent!
Usage:
- Add API Key: Click "Add API Key" and enter your Groq API key
- Configure Settings: Select your preferred microphone, voice, AI model, and system prompt
- Start Conversation: Click the play button on the "Start" node in the flow diagram
- Talk Naturally: Speak into your microphone, and the AI will respond with voice
- Visual Feedback: Watch the real-time flow diagram showing the processing pipeline
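Microphone selection in the browser generally comes down to listing the available audio inputs and passing the chosen device ID when opening the stream. The following is a minimal sketch under that assumption; `DeviceLike`, `listMicrophones`, and `micConstraints` are hypothetical names for this example, not the template's code.

```typescript
// Minimal structural type matching the fields of the browser's MediaDeviceInfo
// that this sketch needs.
interface DeviceLike {
  kind: string;
  deviceId: string;
  label: string;
}

interface MicOption {
  id: string;
  label: string;
}

/** Keeps only audio inputs and supplies a fallback name when the label is empty. */
function listMicrophones(devices: DeviceLike[]): MicOption[] {
  return devices
    .filter((d) => d.kind === "audioinput")
    .map((d, i) => ({ id: d.deviceId, label: d.label || `Microphone ${i + 1}` }));
}

/** Builds getUserMedia constraints for a specific device, or the default one. */
function micConstraints(
  deviceId?: string,
): { audio: true | { deviceId: { exact: string } } } {
  return { audio: deviceId ? { deviceId: { exact: deviceId } } : true };
}
```

In a browser, the device list would come from `await navigator.mediaDevices.enumerateDevices()`, and the stream from `await navigator.mediaDevices.getUserMedia(micConstraints(id))`. Device labels are typically empty until the user grants microphone permission, which is why the fallback name is useful.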
This template is designed as a foundation to build on. Key areas for customization:
- Model Selection: Update the AI model configuration in the AI Model dropdown
- Voice Selection: Choose from 19 different PlayAI voices
- System Prompts: Customize the AI's behavior by editing the system prompt in the UI
Next Steps:
- Create your free GroqCloud account: Access official API docs, the playground for experimentation, and more resources via Groq Console.
- Build and customize: Fork this repo and start customizing to build out your own application.
- Get support: Connect with other developers building on Groq, chat with our team, and submit feature requests on our Groq Developer Forum.
- See enterprise capabilities: This template showcases production-ready AI that can handle real-time business workloads.
- Discuss your needs: Contact our team to explore how Groq can accelerate your AI initiatives.
This project is licensed under the MIT License - see the LICENSE file for details.
Created by Julian Francisco.