Insanely Fast Whisper ROCm

This project provides a high-performance, GPU-accelerated environment for transcribing audio files on AMD hardware with the ROCm platform. It includes several scripts for different use cases, among them an automatic file-monitoring service (main.py) and a web-based user interface (app.py) built with Gradio. The setup is containerized with Docker and Docker Compose, which gives a consistent, isolated environment optimized for ROCm.

Installation

Prerequisites

Docker: Ensure that Docker (version 20.10 or newer) is installed and running on your system.
Docker Compose: Ensure that Docker Compose is installed. It comes bundled with Docker Desktop on Windows and Mac and can be installed separately on Linux.
ROCm: This project requires an AMD GPU that is compatible with ROCm 6.1.2.
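
The prerequisites above can be checked from a short script before building. The sketch below is illustrative only and is not part of the repository: check_prerequisites is a hypothetical helper, and it assumes a Linux host where ROCm exposes the GPU through the /dev/kfd and /dev/dri device nodes. It does not verify the Docker version or the ROCm release.

```python
import os
import shutil


def check_prerequisites():
    """Return a dict mapping each prerequisite to whether it was found.

    Hypothetical helper for illustration; not shipped with the project.
    """
    return {
        # Docker CLI available on PATH.
        "docker": shutil.which("docker") is not None,
        # Only the standalone `docker-compose` binary is detected here;
        # the `docker compose` plugin form is not checked.
        "docker-compose": shutil.which("docker-compose") is not None,
        # Device nodes that ROCm containers normally need (assumption:
        # Linux host with the amdgpu driver loaded).
        "/dev/kfd": os.path.exists("/dev/kfd"),
        "/dev/dri": os.path.exists("/dev/dri"),
    }
```

Printing the returned dict shows at a glance which item is missing; a False entry for /dev/kfd usually means the ROCm driver stack is not installed on the host.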

Steps

Clone the repository:

git clone https://github.com/beecave-homelab/insanely-fast-whisper-rocm.git
cd insanely-fast-whisper-rocm

Create a .env file:

Create a .env file in the root directory of the project with the necessary configuration. Example:

# Default values for main.py
UPLOADS="uploads"
TRANSCRIPTS="transcripts"
LOGS="logs"
BATCH_SIZE=6
VERBOSE=true
MODEL=distil-whisper/distil-large-v3

# Default values for convert_output.py
CONVERT_OUTPUT_FORMATS="txt,srt"
CONVERT_CHECK_INTERVAL=120
PROCESSED_TXT_DIR="transcripts-txt"
PROCESSED_SRT_DIR="transcripts-srt"
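
To illustrate how values like these might be consumed, here is a minimal Python sketch that reads them from the environment and falls back to the defaults shown above. It is an assumption about usage, not the project's actual code: the Settings class and load_settings function are hypothetical names, and only a subset of the variables is shown.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    uploads: str
    transcripts: str
    logs: str
    batch_size: int
    verbose: bool
    model: str
    convert_output_formats: list
    convert_check_interval: int


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ).

    Hypothetical helper; the defaults mirror the example .env file.
    """
    env = os.environ if env is None else env
    formats = env.get("CONVERT_OUTPUT_FORMATS", "txt,srt")
    return Settings(
        uploads=env.get("UPLOADS", "uploads"),
        transcripts=env.get("TRANSCRIPTS", "transcripts"),
        logs=env.get("LOGS", "logs"),
        # Environment values are always strings, so convert explicitly.
        batch_size=int(env.get("BATCH_SIZE", "6")),
        verbose=env.get("VERBOSE", "true").strip().lower() in {"1", "true", "yes"},
        model=env.get("MODEL", "distil-whisper/distil-large-v3"),
        # "txt,srt" becomes ["txt", "srt"]; stray spaces are dropped.
        convert_output_formats=[f.strip() for f in formats.split(",") if f.strip()],
        convert_check_interval=int(env.get("CONVERT_CHECK_INTERVAL", "120")),
    )
```

Passing a plain dict instead of os.environ makes the parsing easy to try out without touching the real environment.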

Build the Docker image:
```
docker-compose build
```
Run the Docker container:
```
docker-compose up -d
```

Usage

Gradio Web UI

The app.py script provides a web interface for uploading files and generating transcripts.

Access the web interface:
- Navigate to http://localhost:7862 in your web browser.
Upload an audio file:
- Use the provided interface to upload an audio file. The file will be processed, and the transcript will be generated and displayed in the interface.
View logs:
- Real-time logs are displayed in the web interface, and you can also find them in the /logs directory.
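
The container can take a while to start, so it may help to wait until the web interface responds before opening the browser. The following sketch polls the address given above with the Python standard library. The function name wait_for_ui and the timeout values are assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_for_ui(url="http://localhost:7862", timeout=60.0, interval=2.0):
    """Return True once `url` answers an HTTP request, False after `timeout` seconds."""
    # Bypass any configured HTTP proxy, since the target is a local service.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except OSError:
            # Connection refused or timed out: the UI is not up yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling wait_for_ui() after docker-compose up -d returns True as soon as something answers on port 7862. It only confirms that the port serves HTTP, not that a model has finished loading.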

Specifying a Different Script

The docker-compose.yaml file lets you choose which script from the /src folder is run by editing its command entry. By default, it runs the /src/app.py script (Gradio Web UI). To run a different script, change the command entry in docker-compose.yaml accordingly. For example, to use the automatic uploading service (main.py):

command: ["src/main.py"] # For automatically processing files in the uploads directory.

Automatic Uploading Service

The main.py script monitors a specified directory for new files and automatically generates transcripts. Follow these steps to use this feature:

Start the service:
- Ensure that docker-compose.yaml is set to run src/main.py and that the Docker container is running (docker-compose up -d).
Place files in the /uploads directory:
- Any files added to this directory will be automatically processed, and the transcripts will be placed in the /transcripts directory.
Check logs:
- Logs for the processing will be stored in the /logs directory.
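
The monitoring behaviour described above can be pictured as a simple polling loop. The sketch below is not the implementation in main.py; it is a minimal illustration under stated assumptions: the names find_unprocessed and watch are hypothetical, a transcript is assumed to share the stem of its source file, and the .json suffix is a guess at the output format.

```python
import time
from pathlib import Path


def find_unprocessed(uploads, transcripts, transcript_suffix=".json"):
    """List files in `uploads` that have no matching transcript yet.

    Assumption: the transcript for "talk.mp3" is stored as "talk.json".
    """
    uploads, transcripts = Path(uploads), Path(transcripts)
    done = {p.stem for p in transcripts.glob(f"*{transcript_suffix}")}
    return sorted(p for p in uploads.iterdir() if p.is_file() and p.stem not in done)


def watch(uploads, transcripts, handle, interval=5.0, max_cycles=None):
    """Call `handle(path)` for each new file, then sleep and repeat.

    `handle` is expected to write the transcript; otherwise the same
    file is picked up again on the next cycle.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        for path in find_unprocessed(uploads, transcripts):
            handle(path)
        cycles += 1
        time.sleep(interval)
```

Polling keeps the sketch dependency-free. A real service might instead rely on filesystem notifications, which react faster but need an extra library.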

License

This project is licensed under the MIT license. See LICENSE for more information.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
