A self-contained version of Piper, the fast and local neural text-to-speech engine.
This version directly embeds espeak-ng and is therefore licensed under the GPL.
Install with:
pip install piper1-tts
List voices with:
python3 -m piper.download_voices
Choose a voice (audio samples are available online) and download it. For example:
python3 -m piper.download_voices en_US-lessac-medium
This will download to the current directory. Override with --data-dir <DIR>
After downloading the example voice above, run:
python3 -m piper -m en_US-lessac-medium -f test.wav -- 'This is a test.'
This will write test.wav with the sentence "This is a test."
If you have voices in a different directory, use --data-dir <DIR>
If you have ffplay installed, omit -f to hear the audio immediately:
python3 -m piper -m en_US-lessac-medium -- 'This will play on your speakers.'
Running Piper this way is slow since it needs to load the model each time. Run the web server unless you need to stream audio (see --output-raw in --help).
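If you are calling Piper from Python, you can also avoid the per-run model load by keeping the voice in memory. A minimal sketch, assuming the PiperVoice class with load and synthesize_wav methods as in recent piper1-tts releases (check the repository if your version's API differs):

```python
import wave

from piper import PiperVoice

# Load the model once and reuse it for many synthesis calls.
voice = PiperVoice.load("en_US-lessac-medium.onnx")

with wave.open("test.wav", "wb") as wav_file:
    voice.synthesize_wav("This is a test.", wav_file)
```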
Some other useful command-line options:
- --cuda - enable GPU acceleration
- --input-file - read input text from one or more files
- --sentence-silence - add seconds of silence after all but the last sentence
- --volume - volume multiplier (default: 1.0)
- --no-normalize - disable automatic volume normalization
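Several of these flags mirror synthesis options in the Python API. A hedged sketch, assuming a SynthesisConfig class as in recent piper1-tts releases (the field names below are taken from the project but may differ in your version):

```python
import wave

from piper import PiperVoice, SynthesisConfig

voice = PiperVoice.load("en_US-lessac-medium.onnx")

# Rough Python-side equivalents of the CLI options above.
syn_config = SynthesisConfig(
    volume=0.5,             # volume multiplier (CLI: --volume)
    length_scale=1.0,       # speaking speed
    noise_scale=1.0,        # speaking variability
    noise_w_scale=1.0,      # phoneme width variability
    normalize_audio=False,  # CLI: --no-normalize
)

with wave.open("test.wav", "wb") as wav_file:
    voice.synthesize_wav("This is a test.", wav_file, syn_config=syn_config)
```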
You can inject raw espeak-ng phonemes with [[ <phonemes> ]] blocks. For example:
I am the [[ bˈætmæn ]] not [[bɹˈuːs wˈeɪn]]
To get phonemes from espeak-ng, use:
espeak-ng -v <VOICE> --ipa=3 -q <TEXT>
For example:
espeak-ng -v en-us --ipa=3 -q batman
bˈætmæn
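If you generate phoneme blocks programmatically, a small wrapper around the espeak-ng command above works. This sketch only shells out to the CLI shown here:

```python
import subprocess

def ipa_phonemes(text: str, voice: str = "en-us") -> str:
    """Return IPA phonemes for text using the espeak-ng CLI (-q: no audio)."""
    result = subprocess.run(
        ["espeak-ng", "-v", voice, "--ipa=3", "-q", text],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# Build a sentence with an injected phoneme block.
print(f"I am the [[ {ipa_phonemes('batman')} ]]")
```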
To use the web server, first install the necessary dependencies:
python3 -m pip install piper1-tts[http]
After downloading the example voice above, run:
python3 -m piper.http_server -m en_US-lessac-medium
This will start an HTTP server on port 5000 (use --host and --port to override).
If you have voices in a different directory, use --data-dir <DIR>
Now you can get WAV files via HTTP:
curl -X POST -H 'Content-Type: application/json' -d '{ "text": "This is a test." }' -o test.wav localhost:5000
The JSON data fields are:

- text (required) - text to synthesize
- voice (optional) - name of voice to use; defaults to -m <VOICE>
- speaker (optional) - name of speaker for multi-speaker voices
- speaker_id (optional) - id of speaker for multi-speaker voices; overrides speaker
- length_scale (optional) - speaking speed; defaults to 1
- noise_scale (optional) - speaking variability
- noise_w_scale (optional) - phoneme width variability
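The same request can be made from Python with only the standard library; the optional field below is illustrative:

```python
import json
from urllib.request import Request, urlopen

payload = {
    "text": "This is a test.",  # required
    "length_scale": 1.0,        # optional: speaking speed
}
req = Request(
    "http://localhost:5000",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# The server responds with WAV audio.
with urlopen(req) as resp, open("test.wav", "wb") as wav_file:
    wav_file.write(resp.read())
```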
Get the available voices with:
curl localhost:5000/voices
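Or from Python (the response is JSON; its exact shape is not assumed here):

```python
import json
from urllib.request import urlopen

with urlopen("http://localhost:5000/voices") as resp:
    print(json.dumps(json.load(resp), indent=2))
```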
For training new voices, see TRAINING.md
We use scikit-build-core along with cmake and swig to build a Python module that directly embeds espeak-ng.
You will need the following system packages installed (apt-get):

- build-essential
- cmake
- ninja-build
- swig
To create a dev environment:
git clone https://github.com/OHF-voice/piper1-gpl.git
cd piper1-gpl
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -e .[dev]
Next, run script/dev_build or manually build the extension:
python3 setup.py build_ext --inplace
Now you should be able to use script/run or manually run Piper:
python3 -m piper --help
You can manually build wheels with:
python3 -m build