
DobriniaMerk/music-snippet-tts

music-snippet-tts

Why did I make this? I don't know.

Why am I publishing the source? I have no idea; maybe someone will find it interesting.

Does this thing have any practical use? Most probably not.

That's it.

Installing and usage

If anyone wants to try this:

To install, get yourself a Genius API key, clone this repository, install the dependencies from requirements.txt, and install SoX.

# Clone this repo
git clone https://github.com/DobriniaMerk/music-snippet-tts.git

# Install dependencies
pip install -r requirements.txt

# Install SoX
sudo apt install sox # on linux

On Windows, download SoX from SourceForge and add it to your PATH.

ffmpeg is also needed for pywhispercpp to work. Installing it is left as an exercise for the reader; I'm too tired to look it up.

After all this, get yourself a Genius API token and put it in genius.py.

That's it.

Description

This is a strange kind of TTS (text-to-speech) algorithm I accidentally thought of and decided to build. Instead of generating audio with neural networks or synthesizing it with complex algorithms, this program makes speech by stitching together pieces of random songs that contain the needed words. The resulting audio is not very intelligible, though.

That's it.

Implementation

First things first: you need to find which songs to steal the words from. Here this is done with Genius, one of the biggest lyrics databases. Since I didn't want to do everything by hand with requests and so on, I used the LyricsGenius Python library. One good thing about it is that it can download the full lyrics of a song, a feature that is strangely missing from the Genius API.
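
A minimal sketch of this step, not the code from this repository: given lyrics that have already been fetched, pick one song per needed word. The helper names (`pick_songs`, `words_in_lyrics`, `fetch_lyrics`) are hypothetical, and how the repository actually chooses candidate songs is not described here. `fetch_lyrics` assumes LyricsGenius is installed and that you already know a title and artist to look up.

```python
import re


def words_in_lyrics(lyrics):
    """Return the set of lowercase words found in a lyrics string."""
    return set(re.findall(r"[a-z']+", lyrics.lower()))


def pick_songs(words, lyrics_by_title):
    """Map each needed word to the first song whose lyrics contain it.

    Words that appear in no song are simply left out of the result.
    """
    vocab = {title: words_in_lyrics(text) for title, text in lyrics_by_title.items()}
    chosen = {}
    for word in words:
        key = word.lower()
        for title, song_words in vocab.items():
            if key in song_words:
                chosen[key] = title
                break
    return chosen


def fetch_lyrics(token, title, artist):
    """Hypothetical helper: download full lyrics for one song via LyricsGenius."""
    import lyricsgenius  # imported lazily so the pure helpers work without it

    genius = lyricsgenius.Genius(token)
    song = genius.search_song(title, artist)
    return song.lyrics if song else None
```

For example, `pick_songs(["hello", "world"], {"Song A": "Hello darkness", "Song B": "What a world"})` maps "hello" to "Song A" and "world" to "Song B".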

Then, once the songs are found, I download their audio from YouTube with youtube-dl.
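
Roughly, the download step could look like the sketch below. It uses youtube-dl's Python API with its `ytsearch1:` search prefix and the `FFmpegExtractAudio` postprocessor. The function names, the output directory, and the choice of WAV are assumptions made for illustration, not what the repository necessarily does.

```python
def audio_options(out_dir):
    """Build youtube-dl options that keep only the audio track, as WAV."""
    return {
        "format": "bestaudio/best",
        "outtmpl": f"{out_dir}/%(id)s.%(ext)s",  # name files by video id
        "noplaylist": True,
        "quiet": True,
        "postprocessors": [
            # Needs ffmpeg on PATH to convert the download to WAV.
            {"key": "FFmpegExtractAudio", "preferredcodec": "wav"},
        ],
    }


def search_query(title, artist):
    """Ask youtube-dl for the first YouTube search result for this song."""
    return f"ytsearch1:{artist} {title}"


def download_audio(title, artist, out_dir="songs"):
    """Hypothetical helper: download one song's audio into out_dir."""
    import youtube_dl  # imported lazily so the helpers above work without it

    with youtube_dl.YoutubeDL(audio_options(out_dir)) as ydl:
        ydl.download([search_query(title, artist)])
```

Taking the first search result is a guess at the right video, so the audio may not always match the lyrics found on Genius.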

The downloaded songs are then transcribed with the Whisper AI model to find where in the track the desired words are. To make things faster (but still not fast enough on my machine), I used a Python wrapper for whisper.cpp: pywhispercpp.
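
Once a transcript with timestamps exists, locating a word is a simple search. The sketch below works on a stand-in `Segment` class rather than on pywhispercpp itself. It assumes each transcribed segment carries a start `t0`, an end `t1` (both in units of 10 ms, as in whisper.cpp) and a `text`, and that the transcription was configured to produce short enough segments. `find_word` is a hypothetical name.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    """Stand-in for a transcribed segment (assumed shape, see the note above)."""
    t0: int    # start time, in units of 10 ms
    t1: int    # end time, in units of 10 ms
    text: str


def find_word(segments, word):
    """Return (start, end) in seconds of the first segment containing word.

    Returns None when the word does not occur in any segment.
    """
    target = word.lower()
    for seg in segments:
        tokens = re.findall(r"[a-z']+", seg.text.lower())
        if target in tokens:
            return seg.t0 / 100.0, seg.t1 / 100.0
    return None
```

The returned times are only as precise as the segments: if a segment holds a whole line of the song, the cut will contain the whole line, not just the word.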

And finally, the audio is cut and combined with SoX, a command-line multitool for audio processing.
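
The cutting and joining can be sketched with two SoX invocations: the `trim` effect (start, then length) extracts a snippet, and passing several input files before the output file concatenates them. The helper names and file names below are made up for illustration, and the repository may call SoX differently.

```python
def trim_command(src, dst, start, end):
    """Build a SoX command that copies src[start:end] (seconds) into dst."""
    if end <= start:
        raise ValueError("end must be greater than start")
    # `trim` takes the start position followed by the length of the cut.
    return ["sox", src, dst, "trim", f"{start:.3f}", f"{end - start:.3f}"]


def concat_command(parts, dst):
    """Build a SoX command that joins the snippet files, in order, into dst."""
    return ["sox", *parts, dst]


def run(cmd):
    """Run one of the commands above; requires SoX to be on PATH."""
    import subprocess

    subprocess.run(cmd, check=True)
```

A sentence is then one `trim_command` per word followed by a single `concat_command` over the resulting snippets. SoX expects the concatenated files to share the same sample rate and channel count, so snippets from different songs may need converting first.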

That's it.

I spent too much time on this thing.

About

Generating speech by combining snippets from random songs.
