This project utilizes OpenAI’s Whisper to transcribe audio files, offering a simple interface for selecting audio files and choosing the desired Whisper model. Now supporting batch transcription, it calculates audio duration and transcription time, saving results to text files with descriptive filenames.

tolmme/Whisper-AI

Whisper AI Transcription Project

This project utilizes OpenAI's Whisper to transcribe audio files. It now supports batch processing, allowing users to transcribe multiple files at once and providing real-time updates on the transcription progress.

Features

  • User-friendly interface for selecting multiple audio files for transcription.
  • Transcription of audio files using various Whisper models.
  • Real-time progress updates during transcription, indicating the number of files processed and remaining.
  • Calculation of audio duration and transcription time.
  • Saving transcriptions to text files with detailed filenames.
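
As an illustration of the last two features, here is a minimal sketch of how a descriptive output filename could be built from the audio duration and the transcription time. The helper names `format_seconds` and `build_output_path` and the filename layout are assumptions made for this example, not necessarily what transcribe.py does.

```python
from pathlib import Path


def format_seconds(seconds: float) -> str:
    """Render a duration as minutes and seconds, e.g. 65.2 -> '1m05s'."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}m{secs:02d}s"


def build_output_path(audio_path, model_name, audio_seconds, transcribe_seconds):
    """Return a .txt path next to the audio file (hypothetical naming scheme).

    The name records the source file, the Whisper model, the audio
    duration, and how long the transcription took.
    """
    src = Path(audio_path)
    name = (
        f"{src.stem}_{model_name}"
        f"_audio-{format_seconds(audio_seconds)}"
        f"_took-{format_seconds(transcribe_seconds)}.txt"
    )
    return src.with_name(name)
```

With pydub, the audio duration in seconds could be obtained as `len(AudioSegment.from_file(path)) / 1000`, and the transcription time by reading `time.perf_counter()` before and after the Whisper call.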

Requirements

  • Python 3.x
  • openai-whisper (imported in Python as whisper)
  • pydub
  • tkinter (for file dialog on macOS)

Installation

  1. Clone the repository:

    git clone https://github.com/tolmme/Whisper-AI.git
    cd Whisper-AI
  2. Install the required Python packages:

    pip install openai-whisper pydub
  3. Install ffmpeg:

    • macOS:
      brew install ffmpeg
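
Both Whisper and pydub rely on the ffmpeg binary being on the PATH, so it can help to verify the installation before running the script. The check below is a small sketch using only the standard library; the function name `ffmpeg_available` is chosen for this example.

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on the PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg not found; install it before transcribing")
```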

Usage

  1. Run the script:

    python transcribe.py
  2. Follow the on-screen instructions to select one or more audio files and choose a Whisper model.
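
For readers curious how the selection step might work, the following is a rough sketch, not the project's actual code. It assumes a tkinter file dialog for picking files and a typed answer for the model name; `pick_audio_files`, `normalize_model_choice`, the file-type filter, and the list of model names are all illustrative assumptions.

```python
# Common Whisper model sizes; the installed package may offer more.
MODEL_NAMES = ("tiny", "base", "small", "medium", "large")


def pick_audio_files():
    """Open a native file dialog and return the selected paths as a list."""
    # Imported here so the rest of the module works without a display.
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window, show only the dialog
    paths = filedialog.askopenfilenames(
        title="Select audio files",
        filetypes=[("Audio files", "*.mp3 *.wav *.m4a *.flac"), ("All files", "*.*")],
    )
    root.destroy()
    return list(paths)


def normalize_model_choice(answer: str, default: str = "base") -> str:
    """Validate a typed model name, falling back to the default when empty."""
    choice = answer.strip().lower() or default
    if choice not in MODEL_NAMES:
        raise ValueError(f"Unknown model {choice!r}; expected one of {MODEL_NAMES}")
    return choice
```

The validated name could then be passed to `whisper.load_model(...)`.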

Batch Transcription

  • Multiple audio files can be selected and transcribed in a single run.
  • A progress update is shown after each file is processed, which makes it easier to track large batches.
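
The batch behaviour described above can be sketched roughly as follows. This is an illustrative loop under stated assumptions rather than the project's implementation: `transcribe_batch` is a hypothetical helper, and the transcription step is passed in as a function so the loop can be tried without loading a model.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def transcribe_batch(
    paths: Iterable[str],
    transcribe: Callable[[str], str],
    report: Callable[[str], None] = print,
) -> Dict[str, Tuple[str, float]]:
    """Transcribe files one after another, reporting progress after each.

    Returns a mapping of path -> (transcript text, seconds taken).
    """
    paths = list(paths)
    total = len(paths)
    results = {}
    for done, path in enumerate(paths, start=1):
        started = time.perf_counter()
        text = transcribe(path)
        elapsed = time.perf_counter() - started
        results[path] = (text, elapsed)
        report(
            f"[{done}/{total}] {path} finished in {elapsed:.1f}s, "
            f"{total - done} remaining"
        )
    return results
```

With Whisper installed, the `transcribe` argument could be `lambda p: model.transcribe(p)["text"]`, where `model = whisper.load_model("base")`.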

License

This project is licensed under the MIT License - see the LICENSE file for details.
