Speech-to-Text-Transformer

Overview

This project uses Facebook's Wav2Vec2 model for Automatic Speech Recognition (ASR), the task of transcribing spoken language into written text. The repository provides a straightforward way to transcribe the contents of an audio file.

Features

ASR with Facebook's Wav2Vec2 model for high transcription accuracy.
Support for various audio formats.
Pre-processing utilities for audio data.
User-friendly and efficient for transcription tasks.

Getting Started

Prerequisites

Python 3.x
PyTorch
Librosa
Transformers library from Hugging Face
Facebook's Wav2Vec2 Pretrained Model
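
Before running anything, it can help to confirm that the packages listed above are importable. The following is a minimal sketch rather than part of the repository: the helper name `missing_packages` is hypothetical, and the import names (`torch` for PyTorch, `librosa`, `transformers`) are the standard ones for these packages.

```python
import importlib.util

# Import names for the packages listed above (PyTorch is imported as "torch").
REQUIRED = ("torch", "librosa", "transformers")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All prerequisites are importable.")
```

The check only looks the modules up and does not import them, so it runs quickly even when PyTorch is installed.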

Installation

Clone this repository.
Install the required Python packages listed under Prerequisites (PyTorch, Librosa, and Transformers).

Usage

Place your audio file in the repository folder.
Run the notebook in Google Colab or in an Anaconda environment.
The notebook outputs the transcribed text.
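
The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not necessarily the exact code in this repository: the checkpoint name `facebook/wav2vec2-base-960h`, the 16 kHz sample rate that checkpoint expects, and the function name `transcribe` are all assumptions made for the example.

```python
# Assumption: the checkpoint used below was trained on 16 kHz audio.
TARGET_SAMPLE_RATE = 16_000


def transcribe(audio_path, model_name="facebook/wav2vec2-base-960h"):
    """Return the transcription of a single audio file as a string."""
    # The heavy dependencies are imported here so that the function can be
    # defined before the packages are installed.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Load the file as mono and resample it to the rate the model expects.
    speech, _ = librosa.load(audio_path, sr=TARGET_SAMPLE_RATE, mono=True)

    # Download (or load from cache) the processor and the pretrained model.
    processor = Wav2Vec2Processor.from_pretrained(model_name)
    model = Wav2Vec2ForCTC.from_pretrained(model_name)
    model.eval()

    # Convert the waveform to model inputs and run a forward pass.
    inputs = processor(
        speech, sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy CTC decoding: take the most likely token for each frame and
    # let the processor collapse repeats and blanks into text.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe("your_audio.wav")` on a file placed in the repository folder would return the text. The first call downloads the checkpoint, so it needs network access.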

License

This project is released under the MIT License.

Acknowledgments

Hugging Face Transformers
Facebook Wav2Vec2 Model

SaadARazzaq/Speech-to-Text-Transformer