SaadARazzaq/Speech-to-Text-Transformer

Speech-to-Text-Transformer

Overview

This project uses Facebook's Wav2Vec2 model to perform Automatic Speech Recognition (ASR), the task of transcribing spoken language into written text. The repository provides a straightforward way to transcribe audio files.

Features

  • ASR with Facebook's Wav2Vec2 model for high transcription accuracy.
  • Support for various audio formats.
  • Pre-processing utilities for audio data.
  • User-friendly and efficient for transcription tasks.

Getting Started

Prerequisites

  • Python 3.x
  • PyTorch
  • Librosa
  • Transformers library from Hugging Face
  • Facebook's Wav2Vec2 Pretrained Model

Installation

  1. Clone this repository.
  2. Install the required Python packages listed under Prerequisites.
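
After installing, it can help to confirm that the prerequisites are importable before running anything heavier. The following is a minimal sketch, not part of this repository: the helper names `missing_modules` and `report` are hypothetical, and the import names (`torch`, `librosa`, `transformers`) are assumed to correspond to the packages listed under Prerequisites.

```python
import importlib.util

# Assumed import names for the packages listed under Prerequisites:
# PyTorch -> torch, Librosa -> librosa, Transformers -> transformers.
REQUIRED_MODULES = ["torch", "librosa", "transformers"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def report(names=REQUIRED_MODULES):
    """Build a one-line summary of the environment check (hypothetical helper)."""
    missing = missing_modules(names)
    if missing:
        return "Missing packages: " + ", ".join(missing)
    return "All prerequisites are installed."


print(report())
```

The check only looks the modules up and does not import them, so it stays fast even though PyTorch and Transformers are large.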

Usage

  1. Place your audio file in the repository folder.
  2. Run the code in Google Colab or an Anaconda environment.
  3. The output is the transcribed text.
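
The steps above might look roughly like the sketch below. It is an illustration under stated assumptions rather than this repository's actual code: the checkpoint `facebook/wav2vec2-base-960h` is assumed because no specific pretrained model is named here, the 16 kHz sample rate is what that checkpoint expects, and `transcribe` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed checkpoint; the README does not say which pretrained model is used.
MODEL_NAME = "facebook/wav2vec2-base-960h"
# Sample rate expected by the assumed checkpoint.
TARGET_SAMPLE_RATE = 16_000


def transcribe(audio_path, model_name=MODEL_NAME):
    """Return the transcript of a single audio file (hypothetical helper)."""
    audio_path = Path(audio_path)
    if not audio_path.is_file():
        raise FileNotFoundError(f"Audio file not found: {audio_path}")

    # Heavy imports are local so the file check above works without them.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Load the audio as a mono float waveform, resampled to 16 kHz.
    speech, _ = librosa.load(str(audio_path), sr=TARGET_SAMPLE_RATE, mono=True)

    # Downloads the checkpoint on first use, then reads it from the local cache.
    processor = Wav2Vec2Processor.from_pretrained(model_name)
    model = Wav2Vec2ForCTC.from_pretrained(model_name)
    model.eval()

    # Turn the waveform into a batch of model inputs.
    inputs = processor(speech, sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt")

    # Inference only, so gradients are not needed.
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy CTC decoding: pick the most likely token at each time step,
    # then let the processor collapse repeats and blanks into text.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

With an audio file named, say, `sample.wav` (a placeholder name) in the repository folder, calling `print(transcribe("sample.wav"))` would print the transcript. Long recordings may need to be split into shorter segments first, since the whole waveform is passed through the model in one go.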

License

This project is licensed under the MIT License.

Acknowledgments

  • Hugging Face Transformers
  • Facebook Wav2Vec2 Model
