🎙️ Powerful GUI tool to transcribe and translate audio/video files using Whisper and OpenAI: fast, simple, and GPU-optimized.

jjaruna/autoTranscriptGUI

AutoTranscript GUI 🎙️

AutoTranscript is a powerful, GPU-accelerated subtitle generator built on top of OpenAI's Whisper model. It features both a command-line interface (CLI) and a beautiful CustomTkinter-based GUI for users who prefer a graphical workflow.

Supports:

  • Languages such as English, Chinese, Japanese, and Korean
  • Local audio/video files
  • Subtitle translation to English
  • OpenAI API (for higher-quality translations)

✨ Features

  • 🖥️ Full-featured GUI with progress tracking, real-time logs, and OpenAI config
  • 📜 Generate .srt subtitle files from media files
  • 🌍 Supports multilingual transcription and optional translation to English
  • 🧠 Uses Faster-Whisper for fast GPU-accelerated transcription
  • 🔐 API key manager for OpenAI GPT models

📸 GUI Preview

(screenshot of the GUI)


🧩 Requirements

  • Python 3.8+
  • ffmpeg (must be installed)
  • NVIDIA GPU with CUDA (recommended)
  • PyTorch with CUDA
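
Before installing, it can help to confirm that the requirements above are actually met. The following is a minimal sketch, not part of AutoTranscript itself: the function name `check_environment` is hypothetical, and it only checks that an `ffmpeg` executable is on the PATH and whether an installed PyTorch reports a usable CUDA device.

```python
import shutil


def check_environment():
    """Report whether ffmpeg and a CUDA-enabled PyTorch are available."""
    report = {
        "ffmpeg": shutil.which("ffmpeg") is not None,  # executable found on PATH
        "torch": False,
        "cuda": False,
    }
    try:
        import torch  # optional: only needed for the GPU check
    except ImportError:
        return report
    report["torch"] = True
    report["cuda"] = bool(torch.cuda.is_available())
    return report
```

Calling `check_environment()` returns a dictionary of booleans. If `cuda` is false, transcription may still be possible on the CPU, but this README recommends a CUDA GPU.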

Requirements for Releases

  • The release .rar archive contains everything you need to get started, so nothing else has to be installed.

📦 Installation

git clone https://github.com/jjaruna/autoTranscriptGUI.git
cd autoTranscriptGUI
pip install -r requirements.txt

🚀 Launch the GUI

python AutoTranscriptGUI.py

🔍 Whisper Model Comparison Summary

| Model | VRAM (Min) | ⚙️ Performance | 🎯 Use Case | 🌐 Translate into English |
| --- | --- | --- | --- | --- |
| tiny | ≥ 1 GB | ⚡ Very Fast | Quick tests, low-resource devices | ✅ |
| base | ≥ 2 GB | ⚡ Fast | Simple transcriptions, short audio | ✅ |
| small | ≥ 4 GB | ⚖️ Balanced | Decent accuracy and speed for general use | ✅ |
| medium | ≥ 8 GB | 🕒 Slower | High-quality results for longer files | ✅ |
| large-v1 | ≥ 10 GB | 🐢 Slower | Older but still strong performer | ✅ |
| large-v2 | ≥ 10 GB | 🐢 Slower | More robust, especially with noisy inputs | ✅ |
| large-v3 | ≥ 12 GB | 🐌 Slowest | Highest accuracy offline, latest version | ✅ |
| large-v3-turbo | ≥ 8–10 GB | ⚡ Fastest | High-speed, high-accuracy, great multilingual support | ❌ |
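
As a rough illustration of how the table above can guide model choice, here is a small sketch that filters models by available VRAM. The names `MIN_VRAM_GB`, `NO_TRANSLATE`, and `models_that_fit` are hypothetical and not part of the app. The figures are copied from the table, and the lower bound (8 GB) is assumed for large-v3-turbo.

```python
# Minimum VRAM in GB per model, copied from the comparison table.
# large-v3-turbo is listed as 8-10 GB; the lower bound is assumed here.
MIN_VRAM_GB = {
    "tiny": 1,
    "base": 2,
    "small": 4,
    "medium": 8,
    "large-v1": 10,
    "large-v2": 10,
    "large-v3": 12,
    "large-v3-turbo": 8,
}

# Per the table, large-v3-turbo is the only model marked as unable
# to translate into English.
NO_TRANSLATE = {"large-v3-turbo"}


def models_that_fit(vram_gb, need_translation=False):
    """Return the models whose listed minimum VRAM is within vram_gb."""
    return [
        name
        for name, needed in MIN_VRAM_GB.items()
        if needed <= vram_gb
        and not (need_translation and name in NO_TRANSLATE)
    ]
```

For example, `models_that_fit(4)` gives `['tiny', 'base', 'small']`. The listed minimums are guidance rather than hard limits, so a model outside this list may still run on your hardware.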

🧠 Recommendation

After testing the large-v3-turbo model more than 10 times, I can confidently say it is the fastest and most accurate among all Whisper models included in this app.

🖥️ My system has 4 GB of VRAM, which is below the recommended minimum for the large models, yet large-v3-turbo still performed exceptionally well.

⚠️ Note: Your experience may vary depending on your GPU and available VRAM. Use this recommendation as a reference, not a guarantee. If you encounter performance issues, try smaller models like medium or small.


⚙️ OpenAI API Setup (Optional)

To enable OpenAI-powered translation:

  1. Click "Add API Key" in the GUI
  2. Enter your OpenAI key and model (gpt-4, gpt-3.5-turbo, etc.)
  3. The key and model are saved to a .env file automatically
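
This README does not document the layout of the .env file, so the following is only a sketch of how such a file is commonly read and written as plain KEY=VALUE lines. The helper names `load_env` and `save_env` and the variable names `OPENAI_API_KEY` and `OPENAI_MODEL` are assumptions, not names confirmed by the project.

```python
from pathlib import Path


def load_env(path):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    path = Path(path)
    result = {}
    if not path.exists():
        return result
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result


def save_env(path, values):
    """Merge values into the file, keeping other KEY=VALUE entries.

    Comments in an existing file are not preserved by this sketch.
    """
    merged = load_env(path)
    merged.update(values)
    text = "".join(f"{key}={value}\n" for key, value in merged.items())
    Path(path).write_text(text, encoding="utf-8")
```

A call such as `save_env(".env", {"OPENAI_API_KEY": "sk-...", "OPENAI_MODEL": "gpt-4"})` would store both settings. Keep the .env file out of version control, since it holds a secret.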

🖥️ CLI Mode (Optional)

You can still use the command-line version via autosub.py:

python autosub.py myvideo.mp4 -l ja --translate --model base

CLI Options

| Option | Description |
| --- | --- |
| filename | Path to the input media file |
| -l, --language | Force language (e.g. en, es, zh) |
| -t, --translate | Translate to English |
| -o, --openai | Use OpenAI API |
| --model | Whisper model to use |
| --debug | Enable debug mode |
| --keep | Keep intermediate WAV file |
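
To make the option list concrete, here is a sketch of an `argparse` parser that accepts the same flags. It is reconstructed from the table only and is not the actual source of autosub.py. The function name `build_parser` is hypothetical, and the `--model` default of `base` is an assumption.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented CLI options."""
    parser = argparse.ArgumentParser(
        prog="autosub.py",
        description="Generate .srt subtitles from a media file.",
    )
    parser.add_argument("filename", help="Path to the input media file")
    parser.add_argument("-l", "--language",
                        help="Force language (e.g. en, es, zh)")
    parser.add_argument("-t", "--translate", action="store_true",
                        help="Translate to English")
    parser.add_argument("-o", "--openai", action="store_true",
                        help="Use OpenAI API")
    # The default value here is an assumption; the README does not state one.
    parser.add_argument("--model", default="base",
                        help="Whisper model to use")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug mode")
    parser.add_argument("--keep", action="store_true",
                        help="Keep intermediate WAV file")
    return parser
```

Parsing the example command's arguments, `build_parser().parse_args(["myvideo.mp4", "-l", "ja", "--translate", "--model", "base"])`, yields a namespace with `language` set to `ja` and `translate` set to true.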

📁 Output

  • Subtitles are saved as .srt files in the same folder as your media.
  • If translation is enabled, both the original and the translated text are preserved.
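
For readers unfamiliar with the .srt format, the sketch below shows how an output path next to the media file can be derived and how timed segments map onto SubRip blocks. The function names are hypothetical and this is not the app's own writer. Segments are assumed to be `(start_seconds, end_seconds, text)` tuples.

```python
from pathlib import Path


def srt_path_for(media_path):
    """Return a path in the media file's folder with a .srt extension."""
    return Path(media_path).with_suffix(".srt")


def srt_timestamp(seconds):
    """Format seconds as the SubRip timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render (start, end, text) tuples as numbered SubRip blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Blocks are separated by a blank line, as the format requires.
    return "\n".join(blocks)
```

Writing the result is then a matter of `srt_path_for("myvideo.mp4").write_text(to_srt(segments), encoding="utf-8")`.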

🧪 Example GUI Workflow

  1. Open the GUI
  2. Select a video/audio file
  3. Choose the language and Whisper model
  4. (Optional) Enable "Translate to English"
  5. (Optional) Enable "Use OpenAI"
  6. Click "Start Transcription"
  7. Wait for the progress bar and logs to finish

🙏 Credits


📄 License

MIT License: free for personal and commercial use.
