Skip to content

Create your own AI voice assistant for inbound calls using OpenAI GPT, Deepgram, and Twilio. Costs only 1 cent per minute.

Notifications You must be signed in to change notification settings

Barty-Bart/ai-voice-assistant-openai-deepgram

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Low-Cost AI Voice Assistant with OpenAI, Deepgram & Twilio

Build your own AI voice assistant that can handle inbound calls using OpenAI's GPT for conversation, Deepgram for speech processing, and Twilio for telephony - all for around 1 cent per minute!

CleanShot 2024-11-23 at 17 47 04 1

Watch the Tutorial Video

Overview

This repository is Part 1 of a series demonstrating how to build production-ready AI voice assistants. In this first part, we focus on handling inbound calls and basic FAQ responses, achieving:

  • ~1 second latency
  • ~$0.01 per minute cost
  • Natural conversation flow with interruption handling
  • Scalable architecture for future expansion

Coming in Part 2 (stay tuned!):

  • Function calling capabilities
  • Outbound call handling
  • Enhanced text-to-speech with 11.labs
  • And more!

System Architecture

High-Level Architecture

CleanShot 2024-11-24 at 17 45 26

Key Components:

  1. Twilio: Handles inbound calls and audio streaming
  2. Deepgram:
    • Speech-to-Text: Real-time transcription
    • Text-to-Speech: Response generation
  3. OpenAI GPT: Natural language processing and response generation
  4. WebSocket Server: Real-time audio streaming and service orchestration

Code Architecture

CleanShot 2024-11-24 at 17 45 09

The system is built with a modular architecture:

  • app.js: Main server and WebSocket handling
  • services/:
    • gpt-service.js: OpenAI integration and conversation management
    • stream-service.js: Audio streaming and buffer management
    • transcription-service.js: Speech-to-text processing
    • tts-service.js: Text-to-speech conversion

Setup Guide

Prerequisites

  • Node.js (v14+)
  • npm/yarn
  • Accounts with:
    • Twilio
    • Deepgram
    • OpenAI

Installation

  1. Clone the repository:
git clone https://github.com/Barty-Bart/ai-voice-assistant-openai-deepgram.git
cd ai-voice-assistant-openai-deepgram
  1. Install dependencies:
npm install
  1. Create .env file:
SERVER=your-server-domain
DEEPGRAM_API_KEY=your-deepgram-api-key
VOICE_MODEL=your-preferred-voice-model
OPENAI_API_KEY=your-openai-api-key
  1. Configure Twilio:
  • Set up a Twilio phone number
  • Configure webhook to point to your /incoming endpoint
  • Ensure your server has HTTPS (required for Twilio)
  1. Start the server:
npm start

How It Works

  1. Call Initiation:

    • Customer calls Twilio number
    • Twilio establishes WebSocket connection with server
  2. Real-time Processing:

    • Speech-to-Text: Customer audio → Deepgram → Text
    • Processing: Text → OpenAI GPT → Response
    • Text-to-Speech: Response → Deepgram → Audio
    • Audio streamed back to caller
  3. Key Features:

    • Real-time transcription and response
    • Natural conversation handling
    • Interruption detection
    • Ordered message queuing

Future Improvements

  • Implement streaming TTS API from Deepgram for reduced latency
  • Integrate Elevenlabs for enhanced voice quality
  • Add outbound calling capabilities
  • Implement function calling for complex tasks
  • Add more sophisticated conversation handling

Costs & Performance

  • Cost: Approximately 1 cent per minute
    • Significantly lower than commercial alternatives ($5-10 cents/min)
  • Latency: ~1 second response time
    • Can be further optimized with streaming TTS

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Resources

Source Code

This project was built based on Twilio's Call-GPT.

Keywords

ai voice assistant, openai gpt, deepgram, twilio, voice ai, chatbot, conversational ai, speech recognition, text to speech, websocket, nodejs, real-time audio, low-cost ai, inbound calls


Star ⭐ this repository if you find it helpful!

About

Create your own AI voice assistant for inbound calls using OpenAI GPT, Deepgram, and Twilio. Costs only 1 cent per minute.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published