Whisper Audio Transcription App

A Windows GUI application for transcribing audio files using OpenAI's Whisper speech recognition model locally on your machine.

Features

Transcribe audio files (MP3, WAV, M4A, FLAC, OGG, MP4)
Select from different Whisper model sizes (tiny, base, small, medium, large)
Choose between CPU and GPU processing (GPU requires CUDA)
Progress indicator during transcription
View raw transcript results
Convert to Markdown with live preview
Export as raw text or Markdown files
Custom save location
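
The format, model, and device choices listed above can be checked before any transcription starts. The following is a minimal sketch, not the code in main.py: the constant and function names are hypothetical, and the extension list is simply copied from the feature list.

```python
from pathlib import Path

# Values copied from the feature list; the names are hypothetical.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".mp4"}
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def is_supported_audio(path):
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def pick_device(prefer_gpu=True):
    """Return "cuda" when requested and available, otherwise "cpu"."""
    if not prefer_gpu:
        return "cpu"
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

Falling back to "cpu" keeps the application usable on machines without a CUDA-capable GPU.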

Installation

Clone or download this repository
Create and activate a virtual environment:

python -m venv whisper_env
whisper_env\Scripts\activate

Install required dependencies:

pip install -r requirements.txt

Note: The Whisper package is installed directly from the GitHub repository to ensure compatibility.
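
After installing, it can help to confirm that the packages are importable from the virtual environment. This is a small sketch under the assumption that the import names are PyQt6, whisper, torch, and markdown; the helper name is hypothetical and is not part of this repository.

```python
import importlib.util

# Assumed import names for the packages this README lists.
REQUIRED_MODULES = ("PyQt6", "whisper", "torch", "markdown")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Run it with the virtual environment activated; anything it reports as missing can be reinstalled with pip.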

Creating the Executable

Make sure you have PyInstaller installed:

pip install pyinstaller

Create the executable using the provided spec file:

pyinstaller whisper_transcribe.spec

The executable is created in the dist folder. You can create a desktop shortcut to dist\WhisperTranscribe\WhisperTranscribe.exe.

Note: The first time you use a given model size, the application downloads that model's files. This can take a few minutes, depending on your internet connection and the model size you select.
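
To see which models have already been downloaded, you can list the checkpoint files in Whisper's cache. The sketch below assumes the library's default location (a whisper folder under XDG_CACHE_HOME, or under .cache in your home directory); whether the packaged executable uses the same location is an assumption, and the function names are hypothetical.

```python
import os
from pathlib import Path


def default_whisper_cache():
    """Directory where Whisper stores downloaded models by default (assumed)."""
    base = os.getenv("XDG_CACHE_HOME", os.path.join(os.path.expanduser("~"), ".cache"))
    return Path(base) / "whisper"


def cached_models(cache_dir=None):
    """Return the names of model checkpoint files (*.pt) already downloaded."""
    cache_dir = Path(cache_dir) if cache_dir is not None else default_whisper_cache()
    if not cache_dir.is_dir():
        return []
    return sorted(p.stem for p in cache_dir.glob("*.pt"))
```

An empty list means no model has been downloaded to that folder yet, so the next transcription will need an internet connection.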

Usage

Run the application (either through Python or the executable)
Select an audio file to transcribe
Choose the Whisper model size and processing device
Set your preferred save directory (optional)
Click "Transcribe Audio" and wait for the process to complete
View the results in the Raw Transcript tab
Optionally convert to Markdown with the "Prettify to Markdown" button
Export the transcription as a text or Markdown file
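
The same steps can be sketched without the GUI. This is an illustration of the flow using Whisper's load_model and transcribe calls, not the code in main.py; the function name, the default save folder, and the output file naming are assumptions.

```python
from pathlib import Path

MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def transcribe_file(audio_path, model_size="base", device="cpu", save_dir="transcribed_text"):
    """Transcribe one audio file, save the raw text, and return the output path."""
    if model_size not in MODEL_SIZES:
        raise ValueError(f"Unknown model size: {model_size!r}")

    import whisper  # imported lazily: loading Whisper also pulls in PyTorch

    model = whisper.load_model(model_size, device=device)
    # Half precision is only used on the GPU; on the CPU Whisper runs in FP32.
    result = model.transcribe(str(audio_path), fp16=(device == "cuda"))

    out_dir = Path(save_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / (Path(audio_path).stem + ".txt")
    out_path.write_text(result["text"].strip(), encoding="utf-8")
    return out_path
```

For example, transcribe_file("meeting.mp3", model_size="small") would write transcribed_text/meeting.txt. Whisper decodes audio through the ffmpeg command-line tool, so ffmpeg needs to be available on PATH when running from Python.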

Requirements

Python 3.7+ (recent Whisper releases declare Python 3.8 or newer, so 3.8+ is the safer choice)
PyQt6
OpenAI Whisper (from GitHub)
PyTorch
Markdown
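
The Markdown package is what turns Markdown text into HTML for a preview. The sketch below shows one possible way to go from a raw transcript to Markdown and then to preview HTML; the paragraph-grouping heuristic and both function names are hypothetical and may differ from what the "Prettify to Markdown" button actually does.

```python
import re


def to_markdown(raw_text, title="Transcript", sentences_per_paragraph=4):
    """Group transcript sentences into paragraphs under a single heading."""
    # Split after ., ! or ? followed by whitespace; a rough heuristic only.
    parts = re.split(r"(?<=[.!?])\s+", raw_text.strip())
    sentences = [s.strip() for s in parts if s.strip()]
    paragraphs = [
        " ".join(sentences[i:i + sentences_per_paragraph])
        for i in range(0, len(sentences), sentences_per_paragraph)
    ]
    return "\n\n".join([f"# {title}"] + paragraphs) + "\n"


def render_preview(markdown_text):
    """Render Markdown to HTML, or return None if the Markdown package is missing."""
    try:
        import markdown
    except ImportError:
        return None
    return markdown.markdown(markdown_text)
```

Keeping the conversion separate from the rendering means the Markdown text can be exported as a file without going through the preview.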

Notes

Larger models provide better transcription quality but require more memory and processing time
GPU acceleration significantly improves processing speed for larger models
The application creates "transcribed_text" and "uploaded_audio" directories in the application folder
The executable bundles the application's Python dependencies, so it should run on Windows without a separate Python installation; model files are still downloaded on first use
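
The working directories mentioned in these notes can be created up front so that later file operations do not fail. A minimal sketch, assuming the folders sit directly under the application folder; the function name is hypothetical.

```python
from pathlib import Path

# Directory names taken from the notes above.
WORK_DIRS = ("transcribed_text", "uploaded_audio")


def ensure_work_dirs(app_dir="."):
    """Create the working directories if they are missing and return their paths."""
    paths = [Path(app_dir) / name for name in WORK_DIRS]
    for path in paths:
        path.mkdir(parents=True, exist_ok=True)
    return paths
```

Because exist_ok=True is set, calling the function repeatedly is harmless.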

dleon86/whisper_file_transcribe
