A Python tool to find when a specific phrase is spoken in a video using OpenAI's Whisper.
It transcribes your video once (caching the result) and lets you search for phrases later, fast!
- Transcribes video audio using Whisper (tiny model for speed)
- Caches transcript automatically for faster repeated queries
- Returns precise timestamps (seconds) where your query phrase is said
- Supports fuzzy/approximate matching for inexact phrases
- Includes utility to clear cached transcripts
- Python 3.x
- openai/whisper
- re (regular expressions)
- difflib
- pickle
- os
Install required Python packages:
pip install git+https://github.com/openai/whisper.git
pip install torch
Clone or download this repo, or just copy the script into your project.
1. First time:
- Loads Whisper "tiny" model for fast transcription
- Saves the transcript as a .pkl file for caching
2. On later runs:
- Loads the cached transcript instantly
- Searches text segments for your query
3. Matching:
- Exact or substring matches
- Fuzzy matching for long phrases
- Partial word matches for multi-word queries
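The matching step above can be sketched roughly as follows. This is an illustration, not the script's actual code: `normalize`, `find_timestamp`, and the `fuzzy_threshold` value are hypothetical names and defaults, and the segment dicts are assumed to have the `start` and `text` keys that Whisper produces. For simplicity the fuzzy pass runs for every query, not only long phrases.

```python
import difflib
import re


def normalize(text):
    """Lowercase and drop punctuation so comparisons ignore formatting."""
    return re.sub(r"[^a-z0-9\s]", "", text.lower()).strip()


def find_timestamp(segments, query, fuzzy_threshold=0.6):
    """Return the start time (seconds) of the best matching segment, or None.

    Hypothetical helper; the threshold of 0.6 is an assumed default.
    """
    q = normalize(query)
    q_words = q.split()
    if not q_words:
        return None

    # Pass 1: exact or substring match on the normalized text.
    for seg in segments:
        if q in normalize(seg["text"]):
            return seg["start"]

    # Pass 2: fuzzy match, keeping the segment most similar to the query.
    best_start, best_ratio = None, 0.0
    for seg in segments:
        ratio = difflib.SequenceMatcher(None, q, normalize(seg["text"])).ratio()
        if ratio > best_ratio:
            best_start, best_ratio = seg["start"], ratio
    if best_ratio >= fuzzy_threshold:
        return best_start

    # Pass 3: for multi-word queries, accept the segment that contains
    # the most query words, provided at least half of them are present.
    if len(q_words) > 1:
        best_start, best_hits = None, 0
        for seg in segments:
            words = set(normalize(seg["text"]).split())
            hits = sum(1 for w in q_words if w in words)
            if hits > best_hits:
                best_start, best_hits = seg["start"], hits
        if best_hits * 2 >= len(q_words):
            return best_start

    return None
```

Ordering the passes from strictest to loosest means a clean substring hit always wins over an approximate one; the thresholds are worth tuning against your own transcripts.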
from your_module_name import get_timestamp_for_query
video_path = "your_video.mp4"
query = "your search phrase"
timestamp = get_timestamp_for_query(video_path, query)
print(f"Found at: {timestamp} seconds")
If you want to clear the cached transcript:
from your_module_name import clear_cache
clear_cache("your_video.mp4")
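For readers curious how the transcribe-once, cache, and clear cycle might work internally, here is a minimal sketch. It is an assumption about the approach, not the script's real implementation: `cache_path_for`, `load_or_transcribe`, `clear_cache_sketch`, and the choice to store the .pkl file next to the video are all hypothetical. The Whisper calls used (`whisper.load_model` and `model.transcribe`) are the library's documented entry points.

```python
import os
import pickle


def cache_path_for(video_path):
    """Assumed naming scheme: store the cache beside the video as <name>.pkl."""
    return os.path.splitext(video_path)[0] + ".pkl"


def whisper_transcribe(video_path):
    """Transcribe with Whisper's tiny model and return its list of segments."""
    import whisper  # imported lazily so cache hits never load the model

    model = whisper.load_model("tiny")
    result = model.transcribe(video_path)
    return result["segments"]  # each segment has "start", "end", "text"


def load_or_transcribe(video_path, transcribe=whisper_transcribe):
    """Return cached segments if present, otherwise transcribe and cache them."""
    cache_path = cache_path_for(video_path)
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)

    segments = transcribe(video_path)
    with open(cache_path, "wb") as f:
        pickle.dump(segments, f)
    return segments


def clear_cache_sketch(video_path):
    """Delete the cached transcript; return True if a file was removed."""
    cache_path = cache_path_for(video_path)
    if os.path.exists(cache_path):
        os.remove(cache_path)
        return True
    return False
```

Passing the transcriber in as a parameter keeps the caching logic testable without running Whisper. Only load .pkl files you created yourself, since unpickling untrusted files can execute arbitrary code.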
See it in action (link your video here):
Md Shahriar Rahman Bhuiyan
LinkedIn Profile
If you have feedback or want to collaborate, feel free to reach out via LinkedIn!