Quran Whisper Transcription

This repository contains a Python script that transcribes Quranic recitations from audio files into Arabic text. It uses a Whisper model that has been fine-tuned specifically on Quranic recitation.

Features

Quran-Specific Transcription: Uses tarteel-ai/whisper-tiny-ar-quran, a Whisper model fine-tuned on Quranic recitations, to transcribe Arabic verses.
Audio Input Handling: Accepts standard audio file formats (e.g., MP3) and resamples them to 16 kHz for transcription.
TorchScript Conversion: The model can optionally be converted to TorchScript for easier deployment in production environments.

Getting Started

Prerequisites

Python 3.7+
torch
transformers
librosa

Installation

Clone the repository:

git clone https://github.com/yourusername/quran-whisper-transcription.git

Install the required packages:
```
pip install torch transformers librosa
```
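After installing, a quick check along these lines could confirm that the three packages are visible to the interpreter you plan to run the script with. This is only a sketch: `missing_packages` is a hypothetical helper, not part of this repository, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata  # standard library since Python 3.8


def missing_packages(names):
    """Return the names from `names` that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# An empty list means every prerequisite was found
print("Missing packages:", missing_packages(["torch", "transformers", "librosa"]))
```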

Usage

Load the model and processor:

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch
import librosa

processor = WhisperProcessor.from_pretrained("tarteel-ai/whisper-tiny-ar-quran")
model = WhisperForConditionalGeneration.from_pretrained("tarteel-ai/whisper-tiny-ar-quran")

Transcribe an audio file:

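audio_path = "path_to_your_audio.mp3"
# Load as mono and resample to 16 kHz, the rate the Whisper feature extractor expects
audio_data, sampling_rate = librosa.load(audio_path, sr=16000)

# Convert the waveform to log-mel input features (the processor accepts the NumPy array directly)
inputs = processor(audio_data, sampling_rate=sampling_rate, return_tensors="pt")
generated_ids = model.generate(inputs["input_features"])

# batch_decode returns one string per input, so take the first entry
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print("Transcription:", transcription)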
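Whisper's feature extractor pads or truncates each input to a 30-second window, so a longer recitation passed in a single call would be cut off. One way to handle that is to split the waveform first. The sketch below assumes fixed-length, non-overlapping chunks are acceptable (a verse that straddles a boundary may be split across two chunks). `split_into_chunks` is a hypothetical helper, not part of this repository or of Transformers.

```python
def split_into_chunks(samples, sampling_rate=16000, chunk_seconds=30):
    """Split a 1-D waveform into consecutive chunks of at most chunk_seconds."""
    if sampling_rate <= 0 or chunk_seconds <= 0:
        raise ValueError("sampling_rate and chunk_seconds must be positive")
    chunk_size = int(sampling_rate * chunk_seconds)
    # Slicing works for Python lists and NumPy arrays alike
    return [samples[start:start + chunk_size] for start in range(0, len(samples), chunk_size)]


# Tiny demonstration with a fake 10-sample "waveform" at 2 samples per second
demo_chunks = split_into_chunks(list(range(10)), sampling_rate=2, chunk_seconds=2)
print([len(chunk) for chunk in demo_chunks])
```

Each chunk could then be passed through `processor` and `model.generate` as in the snippet above, and the decoded strings joined with spaces.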

TorchScript Conversion (Optional):

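# Tracing the full model with a single tensor fails because forward() also needs decoder_input_ids, so this traces the encoder only
ts_model = WhisperForConditionalGeneration.from_pretrained("tarteel-ai/whisper-tiny-ar-quran", torchscript=True).eval()
scripted_model = torch.jit.trace(ts_model.get_encoder(), torch.randn(1, 80, 3000))
scripted_model.save("whisper_tiny_ar_quran_scripted.pt")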

Notes

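Whisper expects 16 kHz audio. librosa.load(..., sr=16000) resamples on load; audio read any other way must be resampled to 16 kHz before transcription.
The TorchScript model may have limitations due to unsupported operations in the original model. In particular, tracing records a single forward pass, so the autoregressive decoding loop that model.generate() runs is not captured.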
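If audio reaches the model by some route other than `librosa.load(..., sr=16000)`, the sample rate can be checked up front. The sketch below uses only the standard library and assumes uncompressed PCM WAV input, which is all the `wave` module reads; MP3 files would need a different reader. `wav_sample_rate` and `needs_resampling` are hypothetical helper names.

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, target_rate=16000):
    """Return True if the WAV file's sample rate differs from target_rate."""
    return wav_sample_rate(path) != target_rate
```

A file for which `needs_resampling` returns True should be resampled before its samples are handed to the processor.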

Acknowledgments

Tarteel.ai for providing the fine-tuned Whisper model for Quranic recitations.
Hugging Face for the Transformers library.
