
Klipar/SpeechToTextConvertor


Speech to Text Converter

This project converts audio files to text using the speech_recognition library and the Google Web Speech API. The recognized text can optionally be formatted to improve readability. The project runs in Docker.
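As a rough illustration of how speech_recognition and the Google Web Speech API fit together, here is a minimal sketch. It is not taken from main.py: the `transcribe` function and the `LANGUAGE_CODES` table are hypothetical names chosen for this example, and the language tags are assumed values.

```python
# Hypothetical mapping from language names to language tags (assumed, not from main.py).
LANGUAGE_CODES = {"English": "en-US", "Ukrainian": "uk-UA", "Slovak": "sk-SK"}


def transcribe(path, language="en-US"):
    """Return the text recognized in a WAV, AIFF, or FLAC file."""
    # Imported lazily so this module loads even where SpeechRecognition is not installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return ""  # the speech could not be understood
    # sr.RequestError (network or API failure) is left to propagate to the caller.
```

A call such as `transcribe("speech.wav", LANGUAGE_CODES["Slovak"])` needs network access, because recognition happens on Google's side.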

MIT License Python 3.6

Demo

[Demo of project]

Functionality

  • Supports multiple recognition languages (English, Ukrainian, Slovak, etc.)
  • Convert audio files to text
  • Improved formatting of recognized text
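The formatting step could look something like the sketch below. This is one possible approach under assumed rules (collapse whitespace, capitalize sentence starts, add a final period, wrap long lines); the project's actual formatter may work differently, and `format_text` is a hypothetical name.

```python
import re
import textwrap


def format_text(raw, width=80):
    """Tidy recognized text: normalize spaces, capitalize sentences, wrap lines."""
    text = " ".join(raw.split())  # collapse runs of whitespace and newlines
    if not text:
        return ""
    # Uppercase the first letter of the text and of each sentence after . ! or ?
    text = re.sub(
        r"(^|[.!?]\s+)(\w)",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )
    if text[-1] not in ".!?":
        text += "."
    return textwrap.fill(text, width=width)
```

Because the pattern uses `\w` rather than `[a-z]`, it also capitalizes non-Latin letters, which matters for languages such as Ukrainian.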

Installation

  1. Install Docker and Docker Compose.

Before using, make sure that Docker and Docker Compose are installed:

🔹 On Linux (Arch, Ubuntu, Debian):

sudo pacman -S docker docker-compose  # For Arch
sudo apt install docker.io docker-compose -y  # For Ubuntu/Debian

🔹 On macOS and Windows: install Docker from the official Docker website.

Make sure Docker is working:

docker --version
docker-compose --version
  2. Clone this repository (make sure Git is installed):
git clone https://github.com/Klipar/speech_to_text_convertor.git
cd speech_to_text_convertor
  3. Build the Docker image. This command builds the container with the project (using the Dockerfile):
docker-compose build
  4. If the build finished without errors, congratulations: you can now start using the program!

Usage/Examples

  1. Launch the container and immediately enter the terminal to work with the program:
docker-compose run --rm speech-to-text bash
  2. Start the script with the following command:
python main.py

You can also pass the path of the file to be transcribed directly at startup:

python main.py Path/to/your/audio.file
  1. Select the recognition language.
  2. Specify the path to the audio file (if you did not already pass it when starting the program).
  3. Optionally format the recognized text for easier reading, or skip this step.
  4. Get the text as a .txt file.
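The two ways of starting the program (with or without a file path) and the .txt output can be sketched as follows. This is an assumption about how such a script might be structured, not the contents of main.py; `parse_args` and `output_path` are hypothetical helpers, and writing the .txt file next to the audio file is an assumed convention.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Accept an optional positional path, so both invocations are valid."""
    parser = argparse.ArgumentParser(description="Transcribe an audio file to text.")
    # nargs="?" makes the positional argument optional; it is None when omitted.
    parser.add_argument("audio", nargs="?", help="path to the audio file")
    return parser.parse_args(argv)


def output_path(audio_path):
    """Return the assumed location of the transcript: same name, .txt suffix."""
    return Path(audio_path).with_suffix(".txt")
```

When `parse_args().audio` is `None`, a script built this way would fall back to asking for the path interactively, for example with `input()`.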

Supported formats

  • .mp3
  • .wav
  • .flac
  • .ogg
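A script can validate the file extension against this list before doing any work. The sketch below also separates formats that speech_recognition's AudioFile reads directly (WAV, AIFF, and FLAC, according to its documentation) from those that would need converting first. How this project actually handles .mp3 and .ogg is not stated here, so the "convert" branch is an assumption, and `check_format` is a hypothetical name.

```python
from pathlib import Path

# Extensions listed in this README.
SUPPORTED = {".mp3", ".wav", ".flac", ".ogg"}
# Subset that speech_recognition's AudioFile can open without conversion.
NATIVE = {".wav", ".flac"}


def check_format(path):
    """Return "native" or "convert" for a supported file, else raise ValueError."""
    ext = Path(path).suffix.lower()  # compare case-insensitively
    if ext not in SUPPORTED:
        raise ValueError(f"Unsupported audio format: {ext or '(none)'}")
    return "native" if ext in NATIVE else "convert"
```

Files in the "convert" group would typically be transcoded to WAV with an external tool such as ffmpeg before recognition.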

License

This project is licensed under the MIT License.

About

A simple example of using the Google Web Speech API to transcribe audio to text.
