PipeCat Voice Agent

PipeCat Voice Agent is an AI-powered voice communication system that enables intelligent, real-time phone conversations. The system integrates various technologies to provide a seamless voice interaction experience:

Core Features

Real-time Voice Processing: Utilizes WebSocket connections for bidirectional audio streaming
Speech Recognition: Integrates with Deepgram for accurate speech-to-text conversion
Natural Language Processing: Leverages OpenAI's GPT-4 for intelligent conversation handling
Text-to-Speech: Uses Cartesia for high-quality voice synthesis
Telephony Integration: Built on Plivo's platform for reliable call handling
Call Recording: Supports automatic call recording and storage
Voice Activity Detection: Implements Silero VAD for precise speech detection

Technical Stack

Backend: FastAPI (Python)
Voice Processing: PipeCat Pipeline System
External Services:
- Plivo (Telephony)
- OpenAI GPT-4 (LLM)
- Deepgram (STT)
- Cartesia (TTS)
- S3 (Recording Storage)

Local Development with Plivo and ngrok

To enable Plivo to reach your local development server, follow these steps:

Install ngrok:

npm install -g ngrok
# or
brew install ngrok

Start your FastAPI server:
```
uvicorn app.main:app --reload
```
Start ngrok tunnel (in a new terminal):
```
ngrok http 8000
```
Copy the HTTPS URL provided by ngrok (e.g., https://your-tunnel.ngrok.io)
Update your environment variables:
```
export BASE_URL=your-ngrok-url
```
Configure Plivo:
- Log into your Plivo dashboard
- Update your application's Answer URL to: {ngrok-url}/api/v1/calls/answer
- Update your application's Hangup URL to: {ngrok-url}/api/v1/calls/hangup

Now Plivo will be able to reach your local server through the secure ngrok tunnel for both HTTP callbacks and WebSocket connections.

Important Notes

The ngrok URL changes each time you restart ngrok (unless you have a paid plan)
Make sure to update your BASE_URL environment variable with the new ngrok URL each time
For production, replace the ngrok URL with your actual domain name

Testing

To test the integration:

Make sure your server is running and ngrok is active

Try making an outbound call using the API endpoint:

curl -X POST "http://localhost:8000/api/v1/calls/outbound/{phone-number}"

The call should connect and establish a WebSocket connection for real-time voice processing

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
app		app
.env.example		.env.example
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
requirements.txt		requirements.txt
runtime.txt		runtime.txt
todo.txt		todo.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

PipeCat Voice Agent

Core Features

Technical Stack

Local Development with Plivo and ngrok

Important Notes

Testing

About

Uh oh!

Releases

Packages

Uh oh!

Languages

navjotdhanawat/py-ai-voice-agent

Folders and files

Latest commit

History

Repository files navigation

PipeCat Voice Agent

Core Features

Technical Stack

Local Development with Plivo and ngrok

Important Notes

Testing

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages