Piper 1 (GPL)

A self-contained version of Piper, the fast and local neural text-to-speech engine.

This version directly embeds espeak-ng and is therefore licensed under the GPL.

Installing

Install with:

pip install piper1-tts

Downloading Voices

List voices with:

python3 -m piper.download_voices

Choose a voice (samples here) and download. For example:

python3 -m piper.download_voices en_US-lessac-medium

This downloads the voice to the current directory. Override the location with --data-dir <DIR>.

Running

After downloading the example voice above, run:

python3 -m piper -m en_US-lessac-medium -f test.wav -- 'This is a test.'

This will write test.wav containing the spoken sentence "This is a test." If your voices are in a different directory, use --data-dir <DIR>.

If you have ffplay installed, omit -f to hear the audio immediately:

python3 -m piper -m en_US-lessac-medium -- 'This will play on your speakers.'

Running Piper this way is slow because the model must be reloaded on every invocation. Prefer the web server below unless you need to stream audio (see --output-raw in --help).

Some other useful command-line options:

  • --cuda - enable GPU acceleration
  • --input-file - read input text from one or more files
  • --sentence-silence - seconds of silence to add after each sentence except the last
  • --volume - adjust volume multiplier (default: 1.0)
  • --no-normalize - disable automatic volume normalization

Raw Phonemes

You can inject raw espeak-ng phonemes with [[ <phonemes> ]] blocks. For example:

I am the [[ bˈætmæn ]] not [[bɹˈuːs wˈe‍ɪn]]

To get phonemes from espeak-ng, use:

espeak-ng -v <VOICE> --ipa=3 -q <TEXT>

For example:

espeak-ng -v en-us --ipa=3 -q batman
bˈætmæn
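Putting the two pieces together, here is a minimal Python sketch that builds the espeak-ng command above and wraps its phoneme output in a [[ ... ]] block. The helper names are illustrative, and espeak-ng is assumed to be on PATH when you actually run the command:

```python
import shlex

def espeak_phoneme_cmd(voice: str, text: str) -> list[str]:
    # The espeak-ng invocation shown above; run it with subprocess to get IPA.
    return ["espeak-ng", "-v", voice, "--ipa=3", "-q", text]

def phoneme_block(phonemes: str) -> str:
    # Wrap raw phonemes so Piper passes them straight to espeak-ng.
    return f"[[ {phonemes} ]]"

print(shlex.join(espeak_phoneme_cmd("en-us", "batman")))
print("I am the " + phoneme_block("bˈætmæn"))
```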

Web Server

Install the necessary dependencies:

python3 -m pip install piper1-tts[http]

After downloading the example voice above, run:

python3 -m piper.http_server -m en_US-lessac-medium

This will start an HTTP server on port 5000 (use --host and --port to override). If your voices are in a different directory, use --data-dir <DIR>.

Now you can get WAV files via HTTP:

curl -X POST -H 'Content-Type: application/json' -d '{ "text": "This is a test." }' -o test.wav localhost:5000

The JSON data fields are:

  • text (required) - text to synthesize
  • voice (optional) - name of voice to use; defaults to -m <VOICE>
  • speaker (optional) - name of speaker for multi-speaker voices
  • speaker_id (optional) - id of speaker for multi-speaker voices; overrides speaker
  • length_scale (optional) - speaking speed scale; defaults to 1 (values above 1 slow speech down, below 1 speed it up)
  • noise_scale (optional) - speaking variability
  • noise_w_scale (optional) - phoneme width variability

Get the available voices with:

curl localhost:5000/voices

Training New Voices

See TRAINING.md.

Building Manually

We use scikit-build-core along with cmake and swig to build a Python module that directly embeds espeak-ng.

You will need the following system packages installed (apt-get):

  • build-essential
  • cmake
  • ninja-build
  • swig

To create a dev environment:

git clone https://github.com/OHF-voice/piper1-gpl.git
cd piper1-gpl
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -e .[dev]

Next, run script/dev_build or manually build the extension:

python3 setup.py build_ext --inplace

Now you should be able to use script/run or manually run Piper:

python3 -m piper --help

You can manually build wheels with:

python3 -m build
