Next.js Voice Agent Starter

Start building interactive voice experiences with Deepgram's Voice Agent API using this Next.js starter application. This project demonstrates how to create a voice agent that can engage in natural conversations using Deepgram's advanced AI capabilities.

What is Deepgram?

Deepgram's voice AI platform provides APIs for speech-to-text, text-to-speech, and full speech-to-speech voice agents. More than 200,000 developers use Deepgram to build voice AI products and features.

Sign up for Deepgram

Before you start, you need a Deepgram API key to use in this project. Sign up for Deepgram and create an API key.

Prerequisites

Before you start, you'll need:

Quickstart

Manual Setup

Follow these steps to get started with this starter application.

Clone the repository

Go to GitHub and clone the repository.

Install dependencies

Install the project dependencies:

npm install

Configure your environment

Create a .env file by copying the contents from sample.env:

cp sample.env .env

Then edit the .env file and replace the placeholder with your actual Deepgram API key:

DEEPGRAM_API_KEY=your_deepgram_api_key_here

You can get your API key from the Deepgram Console.
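A common mistake is to copy sample.env and forget to replace the placeholder. The following is a minimal sketch of a server-side check, assuming the key is read from the environment (Next.js loads .env files into process.env on the server). The helper name requireDeepgramApiKey is hypothetical and is not part of this starter or of any Deepgram SDK.

```typescript
// Placeholder value from sample.env, as shown in the step above.
const PLACEHOLDER_KEY = "your_deepgram_api_key_here";

// Minimal shape of an environment object such as process.env.
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the trimmed key, or throws a descriptive
// error when the key is missing, empty, or still the placeholder.
function requireDeepgramApiKey(env: Env): string {
  const key = env["DEEPGRAM_API_KEY"]?.trim();
  if (!key || key === PLACEHOLDER_KEY) {
    throw new Error(
      "DEEPGRAM_API_KEY is missing or still set to the placeholder. " +
        "Edit your .env file and add your Deepgram API key."
    );
  }
  return key;
}
```

In server-side code this could be called as requireDeepgramApiKey(process.env), so that a misconfigured environment fails at startup with a clear message instead of on the first request.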

Run the application

There are two ways to run this starter application:

Development Mode:

npm run dev

Web Server Mode:

npm run start

Once running, you can access the application in your browser at http://localhost:3000 (development) or the port specified for server mode.

  • Allow microphone access when prompted.
  • Speak into your microphone to interact with the Deepgram Voice Agent.
  • You should hear the agent's responses played back in your browser.

Browser Compatibility

Firefox Support

This application fully supports Firefox as well as Chrome and Safari. Firefox requires special handling due to its unique Web Audio API implementation:

  • Sample Rate: Firefox uses a default 48kHz sample rate and ignores getUserMedia sample rate constraints, while Chrome and Safari use 24kHz as requested.
  • Technical Solution: Audio from Firefox is automatically downsampled from 48kHz to 24kHz before transmission to ensure consistent voice recognition across all browsers.
  • User Experience: There are no differences in functionality, and all browsers provide the same real-time voice interaction capabilities. However, you'll likely see better performance in Chrome and Safari.

If you see "48kHz" in the browser logs when using Firefox, this is expected behavior.
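To illustrate the downsampling step described above, here is a minimal sketch that converts a buffer of 48kHz samples to 24kHz by averaging each group of input samples. This is an assumption about one simple way to do it (a box-average filter), not necessarily the method this starter uses; the function name downsample and its parameters are hypothetical.

```typescript
// Hypothetical helper: reduce the sample rate of mono float audio by
// averaging the input samples that fall into each output sample.
// Averaging acts as a crude low-pass filter, which limits aliasing.
function downsample(
  input: Float32Array,
  fromRate: number,
  toRate: number
): Float32Array {
  if (toRate > fromRate) {
    throw new Error("downsample cannot increase the sample rate");
  }
  if (fromRate === toRate) {
    return input; // Nothing to do, for example Chrome already at 24kHz.
  }

  const ratio = fromRate / toRate; // 2 for 48kHz -> 24kHz
  const outLength = Math.floor(input.length / ratio);
  const output = new Float32Array(outLength);

  for (let i = 0; i < outLength; i++) {
    // Range of input samples that map onto output sample i.
    const start = Math.floor(i * ratio);
    const end = Math.min(input.length, Math.floor((i + 1) * ratio));

    let sum = 0;
    for (let j = start; j < end; j++) {
      sum += input[j];
    }
    output[i] = sum / (end - start);
  }
  return output;
}
```

A caller could compare the actual capture rate (for example, the sampleRate of the AudioContext in use) with the 24kHz target and call downsample(samples, 48000, 24000) only when they differ. A production implementation might prefer a proper low-pass filter or resampler for better audio quality.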

Using the app-requirements.mdc File

  1. Clone or fork this repo.
  2. Modify the app-requirements.mdc file.
  3. Add the necessary configuration settings in the file.
  4. To see the MDC file that was used to help build this starter application, review app-requirements.mdc.

Testing

npm test

Getting Help

We love to hear from you. If you have questions or comments, or find a bug in the project, let us know! You can either:

Contributing

We welcome contributions! Please see our Contributing Guidelines for details on how to get started.

Security

For security concerns and vulnerability reporting, please refer to our Security Policy.

Code of Conduct

This project adheres to the Deepgram Code of Conduct. By participating, you are expected to uphold this code.

Author

Deepgram

License

This project is licensed under the MIT license. See the LICENSE file for more info.
