Skip to content

malghooneh/AI-Voice-Agent

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DeepSeek-R1-AI-Voice-Agent

This project enables real-time speech-to-text transcription using AssemblyAI, generates AI responses with DeepSeek R1 (7B model) via Ollama, and converts text responses into speech using ElevenLabs. The entire process happens in real-time, allowing for seamless interaction.

Disclaimer: Using the Assembly ai, you need to add your credit card


🚀 Features

  • Real-time speech-to-text using AssemblyAI
  • AI-powered responses with DeepSeek R1 (7B model) via Ollama
  • Instant text-to-speech conversion with ElevenLabs
  • Live audio streaming for an interactive experience

🛠️ Setup Instructions

Step 1: Sign Up & Install Dependencies

✅ Get API Keys

✅ Install Ollama

DeepSeek R1 is accessed via Ollama. Install Ollama from:
🔗 Download Ollama

✅ Install PortAudio (Required for real-time transcription)

  • Debian/Ubuntu:

    apt install portaudio19-dev

    MacOS:

    brew install portaudio

####✅ Install Python Libraries

Before running the script, install the required dependencies:

pip install "assemblyai[extras]"
pip install ollama
pip install elevenlabs

✅ (MacOS Only) Install MPV for Audio Streaming

brew install mpv

Step 2: Download the DeepSeek R1 Model

Since this script uses DeepSeek R1 via Ollama, download the model locally by running:

ollama pull deepseek-r1:7b

🛠️ Setup with the install.sh script

Alternatively you could use our install.sh script to take care of the setup.

chmod +x install.sh
./install.sh

🎯 Running the Script

Once all dependencies are installed and the model is downloaded, simply run:

python AIVoiceAgent.py

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 74.9%
  • Shell 25.1%