AutoTranscript is a powerful, GPU-accelerated subtitle generator built on top of OpenAI's Whisper model. It features both a command-line interface (CLI) and a beautiful CustomTkinter-based GUI for users who prefer a graphical workflow.
Supports:
- Multiple languages, including English, Chinese, Japanese, and Korean
- Local audio/video files
- Subtitle translation to English
- OpenAI API (for higher-quality translations)
- 🖥️ Full-featured GUI with progress tracking, real-time logs, and OpenAI configuration
- 📄 Generates `.srt` subtitle files from media files
- 🌍 Supports multilingual transcription and optional translation to English
- 🧠 Uses Faster-Whisper for fast GPU-accelerated transcription
- 🔑 API key manager for OpenAI GPT models
- Python 3.8+
- ffmpeg (must be installed)
- NVIDIA GPU with CUDA (recommended)
- PyTorch with CUDA
- The `.rar` archive contains everything you need to get started without installing anything.
```bash
git clone https://github.com/jjaruna/autoTranscriptGUI.git
cd autoTranscriptGUI
pip install -r requirements.txt
python AutoTranscriptGUI.py
```
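Before launching, it can help to sanity-check the requirements listed above. A minimal sketch (not part of the repo) that checks whether `ffmpeg` is on your `PATH` and whether PyTorch can see a CUDA device:

```python
import shutil

def check_environment() -> dict:
    """Report whether the key runtime dependencies are present."""
    status = {"ffmpeg": shutil.which("ffmpeg") is not None}
    try:
        import torch  # optional: only needed for GPU acceleration
        status["cuda"] = torch.cuda.is_available()
    except ImportError:
        status["cuda"] = False
    return status

print(check_environment())
```

If `ffmpeg` is `False`, install it and make sure it is on your `PATH`; if `cuda` is `False`, transcription falls back to CPU and will be much slower.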
| Model | VRAM (Min) | ⚡ Performance | 🎯 Use Case | 🌐 Translate into English |
|---|---|---|---|---|
| `tiny` | ≥ 1 GB | ⚡ Very Fast | Quick tests, low-resource devices | ✅ |
| `base` | ≥ 2 GB | ⚡ Fast | Simple transcriptions, short audio | ✅ |
| `small` | ≥ 4 GB | ⚖️ Balanced | Decent accuracy and speed for general use | ✅ |
| `medium` | ≥ 8 GB | 🐢 Slower | High-quality results for longer files | ✅ |
| `large-v1` | ≥ 10 GB | 🐢 Slower | Older but still strong performer | ✅ |
| `large-v2` | ≥ 10 GB | 🐢 Slower | More robust, especially with noisy inputs | ✅ |
| `large-v3` | ≥ 12 GB | 🐌 Slowest | Highest accuracy offline, latest version | ✅ |
| `large-v3-turbo` | ≥ 8–10 GB | ⚡ Fastest | High-speed, high-accuracy, great multilingual support | ❌ |
After testing the `large-v3-turbo` model more than 10 times, I can confidently say it is the fastest and most accurate among all Whisper models included in this app.

🖥️ My system has 4 GB of VRAM, and despite being under the recommended VRAM for large models, `large-v3-turbo` still performed exceptionally well. If it struggles on your hardware, fall back to `medium` or `small`.
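The VRAM minimums in the table above can be turned into a simple rule of thumb. A hypothetical helper (not part of the app) that picks the largest offline model whose minimum fits your GPU:

```python
def pick_model(vram_gb: float) -> str:
    """Pick the largest Whisper model whose minimum VRAM (per the table) fits."""
    tiers = [
        (12, "large-v3"),
        (10, "large-v2"),
        (8, "medium"),
        (4, "small"),
        (2, "base"),
    ]
    for minimum, name in tiers:
        if vram_gb >= minimum:
            return name
    return "tiny"  # fallback for very low-resource devices

print(pick_model(4))   # small
print(pick_model(12))  # large-v3
```

This is only a conservative mapping of the table; as noted above, `large-v3-turbo` can run well even below its recommended minimum.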
To enable OpenAI-powered translation:
- Click "Add API Key" in the GUI
- Enter your OpenAI key and model (`gpt-4`, `gpt-3.5-turbo`, etc.)
- The key is saved to a `.env` file automatically
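A `.env` file is just plain `KEY=VALUE` lines. A minimal sketch of how a file like the one the GUI writes could be read back; the variable names here are assumptions, not the app's actual keys:

```python
import os
import tempfile

def load_env(path: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                env[key.strip()] = value.strip()
    return env

# demo with a throwaway file (the real app writes ".env" in its own folder)
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("OPENAI_API_KEY=sk-placeholder\nOPENAI_MODEL=gpt-4\n")
print(load_env(fh.name))
os.unlink(fh.name)
```

Keep the `.env` file out of version control — it contains your API key.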
You can still use the command-line version via `autosub.py`:

```bash
python autosub.py myvideo.mp4 -l ja --translate --model base
```
| Option | Description |
|---|---|
| `filename` | File path |
| `-l`, `--language` | Force language (e.g. `en`, `es`, `zh`) |
| `-t`, `--translate` | Translate to English |
| `-o`, `--openai` | Use OpenAI API |
| `--model` | Whisper model to use |
| `--debug` | Enable debug mode |
| `--keep` | Keep intermediate WAV file |
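The options above can be mirrored in a small `argparse` sketch — an illustration of the interface, not the actual `autosub.py` source (the defaults shown are assumptions):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented autosub.py options."""
    parser = argparse.ArgumentParser(prog="autosub.py")
    parser.add_argument("filename", help="File path")
    parser.add_argument("-l", "--language", help="Force language (e.g. en, es, zh)")
    parser.add_argument("-t", "--translate", action="store_true", help="Translate to English")
    parser.add_argument("-o", "--openai", action="store_true", help="Use OpenAI API")
    parser.add_argument("--model", default="base", help="Whisper model to use")
    parser.add_argument("--debug", action="store_true", help="Enable debug mode")
    parser.add_argument("--keep", action="store_true", help="Keep intermediate WAV file")
    return parser

# parse the example command from above
args = build_parser().parse_args(["myvideo.mp4", "-l", "ja", "--translate", "--model", "base"])
print(args.filename, args.language, args.model, args.translate)  # myvideo.mp4 ja base True
```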
- Subtitles are saved as `.srt` files in the same folder as your media.
- If translated, both the original and translated text are preserved.
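For reference, an `.srt` file takes the media's name and uses `HH:MM:SS,mmm` timestamps. A hypothetical sketch of both conventions (not the app's actual code):

```python
from pathlib import Path

def srt_path(media_file: str) -> Path:
    """Place the subtitle file next to the source media."""
    return Path(media_file).with_suffix(".srt")

def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT HH:MM:SS,mmm timestamp."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1_000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"

print(srt_path("clips/myvideo.mp4"))  # clips/myvideo.srt
print(srt_timestamp(3723.5))          # 01:02:03,500
```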
- Open GUI
- Select video/audio file
- Choose language and Whisper model
- (Optional) Enable "Translate to English"
- (Optional) Enable "Use OpenAI"
- Click Start Transcription
- Wait for progress bar and logs to finish
- Built with OpenAI Whisper
- Powered by Faster-Whisper
- GUI built with CustomTkinter
- Thanks to General Koi for the great help in testing and reviewing the Japanese transcripts.
MIT License β free for personal and commercial use.