This repository contains a Python script that transcribes Quranic recitations from audio files into Arabic text. It uses tarteel-ai/whisper-tiny-ar-quran, a Whisper model fine-tuned specifically on Quranic recitation.
- Quran-Specific Transcription: The model is fine-tuned on Quranic recitations, providing high accuracy in transcribing Arabic verses.
- Audio Input Handling: Accepts standard audio file formats (e.g., MP3) and processes them for transcription.
- TorchScript Conversion: The model can optionally be converted to TorchScript for easier deployment in production environments.
- Python 3.7+
- torch
- transformers
- librosa
- Clone the repository:

      git clone https://github.com/yourusername/quran-whisper-transcription.git

- Install the required packages:

      pip install torch transformers librosa
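After installing, it can help to confirm that the three packages are visible to the interpreter you plan to run the script with. The following is a minimal sketch rather than part of the repository: the helper name `installed_versions` is hypothetical, and it relies on `importlib.metadata`, which is in the standard library from Python 3.8 onward (on 3.7 the `importlib_metadata` backport would be needed).

```python
from importlib import metadata

# Packages listed in the requirements above.
REQUIRED = ("torch", "transformers", "librosa")


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

A value of `None` for any entry means `pip install` targeted a different environment than the one running the check.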
- Load the model and processor:

      from transformers import WhisperProcessor, WhisperForConditionalGeneration
      import torch
      import librosa

      processor = WhisperProcessor.from_pretrained("tarteel-ai/whisper-tiny-ar-quran")
      model = WhisperForConditionalGeneration.from_pretrained("tarteel-ai/whisper-tiny-ar-quran")
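- Transcribe an audio file:

      audio_path = 'path_to_your_audio.mp3'

      # Load as mono float32 and resample to the 16 kHz rate Whisper expects.
      audio_data, sampling_rate = librosa.load(audio_path, sr=16000)

      # The processor accepts the NumPy array directly and returns log-mel features.
      inputs = processor(audio_data, return_tensors="pt", sampling_rate=sampling_rate)
      generated_ids = model.generate(inputs["input_features"])

      # batch_decode returns one string per input; there is a single input here.
      transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
      print("Transcription:", transcription)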
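- TorchScript Conversion (Optional):

      # forward() needs decoder_input_ids as well as the features, and torchscript=True
      # makes it return tuples. Tracing behaviour may vary across transformers versions.
      ts_model = WhisperForConditionalGeneration.from_pretrained("tarteel-ai/whisper-tiny-ar-quran", torchscript=True, use_cache=False).eval()
      example = {"input_features": torch.randn(1, 80, 3000), "decoder_input_ids": torch.tensor([[ts_model.config.decoder_start_token_id]])}
      scripted_model = torch.jit.trace(ts_model, example_kwarg_inputs=example)
      scripted_model.save("whisper_tiny_ar_quran_scripted.pt")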
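The transcription step above feeds the whole recording to the processor in one call. Whisper works on a fixed 30-second window, and the feature extractor pads or truncates to that length by default, so audio beyond 30 seconds would be dropped. One way to handle longer recitations is to split the samples into windows and transcribe each in turn. The sketch below is an assumption, not part of the repository: `iter_chunks` is a hypothetical helper, and cutting at fixed boundaries can split a word in two.

```python
SAMPLE_RATE = 16_000   # rate used when loading audio in the transcription step
WINDOW_SECONDS = 30    # length of Whisper's fixed input window


def iter_chunks(samples, sample_rate=SAMPLE_RATE, window_seconds=WINDOW_SECONDS):
    """Yield consecutive, non-overlapping slices of at most window_seconds.

    Works on anything that supports len() and slicing, including the
    NumPy array returned by librosa.load.
    """
    if sample_rate <= 0 or window_seconds <= 0:
        raise ValueError("sample_rate and window_seconds must be positive")
    step = sample_rate * window_seconds
    for start in range(0, len(samples), step):
        yield samples[start:start + step]


# Hypothetical usage with the objects from the transcription step:
#
# parts = []
# for chunk in iter_chunks(audio_data):
#     inputs = processor(chunk, return_tensors="pt", sampling_rate=SAMPLE_RATE)
#     ids = model.generate(inputs["input_features"])
#     parts.append(processor.batch_decode(ids, skip_special_tokens=True)[0])
# transcription = " ".join(parts)
```

Splitting at pauses between verses instead of at fixed offsets would likely give cleaner boundaries, at the cost of a silence-detection step.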
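- The model expects 16 kHz audio. Passing sr=16000 to librosa.load, as in the example, resamples on load; audio supplied from any other source must be resampled to 16 kHz first for accurate transcription.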
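- The TorchScript model may have limitations due to unsupported operations in the original model. A traced module also captures only a single forward pass, so the autoregressive decoding loop in generate() is not included.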
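For WAV input, the sample rate can be read from the file header before any decoding, which makes it easy to flag recordings that need resampling. This is a minimal sketch using only the standard-library `wave` module; the function name `wav_needs_resampling` is hypothetical, and it does not apply to MP3 or other compressed formats, which `wave` cannot read.

```python
import wave

TARGET_RATE = 16_000  # sample rate the model expects


def wav_needs_resampling(path, target_rate=TARGET_RATE):
    """Return True if the WAV file at `path` is not already at target_rate."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate() != target_rate
```

When the check returns True, loading the file through `librosa.load(path, sr=16000)` as in the usage example performs the resampling.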
- Tarteel.ai for providing the fine-tuned Whisper model for Quranic recitations.
- Hugging Face for the Transformers library.