Start building interactive voice experiences with Deepgram's Voice Agent API using this Python Flask starter application. This project demonstrates how to create a voice agent that can hold natural conversations using Deepgram's voice AI.
Deepgram's voice AI platform provides APIs for speech-to-text, text-to-speech, and full speech-to-speech voice agents. More than 200,000 developers use Deepgram to build voice AI products and features.
Before you start, generate a Deepgram API key to use in this project. Sign up for Deepgram and create an API key.
- Python 3.8 or higher
- Deepgram API key
- Modern web browser with microphone support
- PortAudio installed locally
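The prerequisites above can be partly checked from Python before installing anything. The following is a minimal sketch, not part of the starter application: `python_ok` and `portaudio_found` are hypothetical helper names, and the PortAudio lookup is a best-effort search of the default library path that may miss nonstandard installs.

```python
import ctypes.util
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the prerequisites


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if a (major, minor, ...) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def portaudio_found():
    """Best-effort check: True if a PortAudio shared library can be located."""
    return ctypes.util.find_library("portaudio") is not None


if __name__ == "__main__":
    print("Python 3.8+:", "ok" if python_ok() else "too old")
    print("PortAudio:", "found" if portaudio_found() else "not found on the default library path")
```

A "not found" result for PortAudio is only a hint; the library may still be installed in a location the lookup does not search.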
Follow these steps to get started with this starter application.
- Go to GitHub and clone the repository.
- Install dependencies:
pip install -r requirements.txt
- Set your Deepgram API key:
export DEEPGRAM_API_KEY=your_api_key_here
- Start the Flask server:
python app.py
Then open your browser and go to:
http://localhost:3000
- Allow microphone access when prompted.
- Speak into your microphone to interact with the Deepgram Voice Agent.
- You should hear the agent's responses played back in your browser.
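If the agent does not respond, a common cause is that `DEEPGRAM_API_KEY` was not exported in the shell that started the server. The sketch below shows one way to fail early with a clear message. It assumes only the environment variable name from the setup steps; `get_api_key` is a hypothetical helper, and the actual `app.py` may read the key differently.

```python
import os


def get_api_key(env=None):
    """Return the Deepgram API key from the environment, or raise a clear error."""
    if env is None:
        env = os.environ
    # Treat a missing or whitespace-only value as "not set".
    key = env.get("DEEPGRAM_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DEEPGRAM_API_KEY is not set; export it before running python app.py"
        )
    return key
```

Calling `get_api_key()` once at startup surfaces a missing key immediately, rather than when the first voice session tries to connect.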
This application can be modified as needed by using the app-requirements.mdc file. This file lets you specify settings and parameters for the application in a structured format that can be used with Cursor's AI-powered code editor.
- Clone or fork this repo.
- Modify the app-requirements.mdc file.
- Add the necessary configuration settings in the file.
- For reference, review the app-requirements.mdc file that was used to help build this starter application.
Test the application with:
pytest -v test_app.py
We love to hear from you. If you have questions or comments, or find a bug in the project, let us know! You can:
- Open an issue in this repository
- Join the Deepgram GitHub Discussions Community
- Join the Deepgram Discord Community
We welcome contributions! Please see our Contributing Guidelines for details.
For security concerns, please see our Security Policy.
Please see our Code of Conduct for community guidelines.
This project is licensed under the MIT license. See the LICENSE file for more info.