A FastAPI-based voice assistant that processes audio input, transcribes it using OpenAI's Whisper, generates responses using GPT-4, and converts the response back to speech using OpenAI's TTS.
- Audio file upload and processing
- Speech-to-text transcription using OpenAI Whisper
- AI-powered responses using GPT-4
- Text-to-speech conversion using OpenAI TTS
- CORS-enabled for frontend integration
- Python 3.7 or higher
- OpenAI API key
- pip (Python package installer)
- Clone the repository

  ```bash
  git clone <repository-url>
  cd simple-voice-assistant-app
  ```
- Create a virtual environment (recommended)

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```
- Install dependencies

  ```bash
  pip install -r requirements.txt
  ```
- Set up environment variables by exporting your OpenAI API key in your terminal:

  ```bash
  export OPEN_AI_KEY=your_openai_api_key_here
  ```
- Start the FastAPI server

  ```bash
  uvicorn main:app --reload --host 0.0.0.0 --port 8000
  ```
- The API will be available at:
  - Local: http://localhost:8000
  - API documentation (Swagger UI): http://localhost:8000/docs
  - Alternative docs (ReDoc): http://localhost:8000/redoc
Processes an audio file and returns a speech response.

Request:
- Method: POST
- Endpoint: /api/process-audio
- Content-Type: multipart/form-data
- Body: audio file (WAV format recommended)

Response:
- Content-Type: audio/mp3
- Body: MP3 audio file containing the AI's speech response
Example using curl:

```bash
curl -X POST "http://localhost:8000/api/process-audio" \
  -H "accept: audio/mp3" \
  -F "audio=@your_audio_file.wav" \
  --output response.mp3
```

Note that no explicit `Content-Type` header is set: when `-F` is used, curl sends `multipart/form-data` with the correct boundary automatically, and setting the header by hand would omit the boundary and break the request.
- Record an audio file (WAV format) with your question or request
- Send a POST request to /api/process-audio with the audio file
- Receive an MP3 file with the AI's spoken response
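The steps above can be driven from a short Python client. This is a sketch using the third-party `requests` package; the function name and file paths are placeholders:

```python
import requests  # third-party: pip install requests

API_URL = "http://localhost:8000/api/process-audio"

def ask_assistant(wav_path: str, out_path: str = "reply.mp3") -> str:
    """POST a WAV recording and save the MP3 reply returned by the server."""
    with open(wav_path, "rb") as f:
        # requests sets the multipart/form-data boundary itself
        response = requests.post(API_URL, files={"audio": f})
    response.raise_for_status()
    with open(out_path, "wb") as out:
        out.write(response.content)  # MP3 bytes from the TTS step
    return out_path

# With the server running:
# ask_assistant("your_audio_file.wav")  # writes reply.mp3
```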
The API is configured with CORS to allow requests from http://localhost:3000. To integrate with a frontend:

- Set up your frontend to run on port 3000
- Send audio files to http://localhost:8000/api/process-audio
- Handle the returned MP3 audio response
- CORS Origins: Currently set to http://localhost:3000. Modify line 14 in main.py to add additional origins.
- TTS Voice: Currently using the "alloy" voice. Available options: alloy, echo, fable, onyx, nova, shimmer.
- AI Model: Using GPT-4 for responses. Modify line 33 in main.py to use a different model.
- OpenAI API Key Issues
  - Ensure your API key is correctly set in the environment variable
  - Verify you have sufficient credits in your OpenAI account
- Audio Format Issues
  - The app expects audio files in WAV format
  - Ensure your audio file is not corrupted
- CORS Issues
  - If you're running the frontend on a different port, update the CORS configuration in main.py
- FastAPI: Web framework for building APIs
- OpenAI: Python client for OpenAI API
- Uvicorn: ASGI server for running FastAPI (included in requirements)
This project is open source and available under the MIT License.