This project uses Facebook's Wav2Vec2 model for Automatic Speech Recognition (ASR), the task of transcribing spoken language into written text. The repository provides a straightforward way to transcribe audio files.
- ASR with Facebook's Wav2Vec2 model for high transcription accuracy.
- Support for various audio formats.
- Pre-processing utilities for audio data.
- User-friendly and efficient for transcription tasks.
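
This README does not list which pre-processing steps the utilities perform, so the following is only a minimal sketch of two common ones: downmixing multi-channel audio to mono and peak-normalizing the result. The function names `to_mono` and `peak_normalize` are hypothetical and are not taken from this repository. The `(channels, n_samples)` layout is an assumption that matches what Librosa returns when loading with `mono=False`.

```python
import numpy as np


def to_mono(samples: np.ndarray) -> np.ndarray:
    """Average the channels of a (channels, n_samples) array; 1-D audio passes through."""
    samples = np.asarray(samples, dtype=np.float32)
    return samples if samples.ndim == 1 else samples.mean(axis=0)


def peak_normalize(samples: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Scale audio so its largest absolute value is 1.0; near-silent input is returned unchanged."""
    peak = np.max(np.abs(samples)) if samples.size else 0.0
    return samples if peak < eps else samples / peak
```

For example, `peak_normalize(to_mono(stereo))` would turn a two-channel array into a single channel scaled to the range [-1, 1].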
- Python 3.x
- PyTorch
- Librosa
- Transformers library from Hugging Face
- Facebook's Wav2Vec2 Pretrained Model
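
One way to confirm that the Python packages listed above are available is a small check like the sketch below. It uses only the standard library. The helper name `missing_modules` is hypothetical, and the import names (`torch` for PyTorch, `librosa`, `transformers`) are the usual ones for these packages rather than something this repository specifies.

```python
import importlib.util

# Import names for the packages listed above (PyTorch is imported as "torch").
REQUIRED_MODULES = ("torch", "librosa", "transformers")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found by this interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported missing, it can usually be installed with pip, for example `pip install torch librosa transformers`.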
- Clone this repository.
- Install the required Python packages.
- Place your audio file in the repository.
- Run the code in Google Colab or an Anaconda environment.
- The output is the transcribed text.
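
The steps above can be sketched roughly as follows. This is not the repository's own script. It is a minimal example that assumes the `facebook/wav2vec2-base-960h` checkpoint (the README does not name one), 16 kHz input as that checkpoint expects, and greedy CTC decoding. The function name `transcribe` and the file name in the usage comment are placeholders.

```python
TARGET_SAMPLE_RATE = 16_000  # sampling rate expected by the assumed checkpoint
MODEL_NAME = "facebook/wav2vec2-base-960h"  # assumption: not named in this README


def transcribe(audio_path: str, model_name: str = MODEL_NAME) -> str:
    """Return a greedy CTC transcription of the audio file at `audio_path`."""
    # Heavy dependencies are imported lazily so this module loads without them.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # librosa.load resamples to the requested rate and downmixes to mono by default.
    speech, _ = librosa.load(audio_path, sr=TARGET_SAMPLE_RATE)

    # The first call downloads the pretrained weights and caches them locally.
    processor = Wav2Vec2Processor.from_pretrained(model_name)
    model = Wav2Vec2ForCTC.from_pretrained(model_name)
    model.eval()

    inputs = processor(speech, sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: take the most likely token at each frame, then collapse.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]


# Example usage ("sample.wav" is a placeholder path):
# print(transcribe("sample.wav"))
```

Loading the model inside the function keeps the sketch short. For many files, it would likely be faster to load the processor and model once and reuse them.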
This project is licensed under the MIT License.
- Hugging Face Transformers
- Facebook Wav2Vec2 Model