Add scripts for exporting Piper TTS models to sherpa-onnx #2299

Merged
merged 1 commit into from
Jun 17, 2025
181 changes: 181 additions & 0 deletions .github/workflows/export-piper.yaml
@@ -0,0 +1,181 @@
name: export-piper

on:
  push:
    branches:
      - export-piper
  workflow_dispatch:

concurrency:
  group: export-piper-${{ github.ref }}
  cancel-in-progress: true

jobs:
  export-piper:
    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'
    name: ${{ matrix.index }}/${{ matrix.total }}
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest]
        python-version: ["3.10"]
        total: ["20"]
        index: [
          "0", "1", "2", "3", "4", "5", "6", "7", "8", "9",
          "10", "11", "12", "13", "14", "15", "16", "17", "18", "19",
          ]
        # total: ["1"]
        # index: ["0"]

    steps:
      - uses: actions/checkout@v4

      - name: Setup Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install Python dependencies
        shell: bash
        run: |
          python3 -m pip install --upgrade pip jinja2 iso639-lang onnx==1.17.0 onnxruntime==1.17.1 sherpa-onnx onnxmltools==1.13.0
          python3 -m pip install "numpy<2" soundfile

      - name: Generate script
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        shell: bash
        run: |
          cd scripts/piper

          total=${{ matrix.total }}
          index=${{ matrix.index }}

          git config --global user.email "csukuangfj@gmail.com"
          git config --global user.name "Fangjun Kuang"

          git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples hf

          python3 ./generate.py --total $total --index $index
          chmod +x ./generate.sh
          ls -lh

      - name: Show script
        shell: bash
        run: |
          cd scripts/piper
          cat ./generate.sh

      - name: Run script
        shell: bash
        run: |
          cd scripts/piper
          ./generate.sh

      - name: Show generated mp3 files
        shell: bash
        run: |
          cd scripts/piper
          ls -lh hf/piper/mp3/*
          echo "----"
          ls -lh hf/piper/mp3/*/*

      - name: Push generated mp3 files
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        uses: nick-fields/retry@v3
        with:
          max_attempts: 20
          timeout_seconds: 200
          shell: bash
          command: |
            cd scripts/piper/hf
            git pull --rebase
            git lfs track "*.mp3"
            git status .
            git add .
            git commit -m 'Add mp3 files'
            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples main

      - name: Show generated model files
        shell: bash
        run: |
          cd scripts/piper
          ls -lh *.tar.bz2

      - name: Show generated model files(2)
        shell: bash
        run: |
          cd scripts/piper
          ls -lh release/

      - name: Publish to huggingface
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        uses: nick-fields/retry@v3
        with:
          max_attempts: 20
          timeout_seconds: 200
          shell: bash
          command: |
            git config --global user.email "csukuangfj@gmail.com"
            git config --global user.name "Fangjun Kuang"

            export GIT_LFS_SKIP_SMUDGE=1
            export GIT_CLONE_PROTECTION_ACTIVE=false

            dirs=(
              vits-piper-de_DE-glados-high
              vits-piper-de_DE-glados-low
              vits-piper-de_DE-glados-medium
              vits-piper-de_DE-glados_turret-high
              vits-piper-de_DE-glados_turret-low
              vits-piper-de_DE-glados_turret-medium
              vits-piper-en_US-glados-high
            )
            for d in ${dirs[@]}; do
              src=scripts/piper/release/$d
              if [ ! -d $src ]; then
                continue;
              fi

              rm -rf huggingface
              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface
              cp -a $src/* ./huggingface
              pushd huggingface
              git lfs track "*.onnx"
              git lfs track af_dict
              git lfs track ar_dict
              git lfs track cmn_dict
              git lfs track da_dict en_dict fa_dict hu_dict ia_dict it_dict lb_dict phondata ru_dict ta_dict
              git lfs track ur_dict yue_dict

              git status
              git add .
              git status
              git commit -m "add models"
              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main
              popd

            done

      - name: Release
        if: github.repository_owner == 'csukuangfj'
        uses: svenstaro/upload-release-action@v2
        with:
          file_glob: true
          file: ./scripts/piper/vits-piper-*.tar.bz2
          overwrite: true
          repo_name: k2-fsa/sherpa-onnx
          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}
          tag: tts-models

      - name: Release
        if: github.repository_owner == 'k2-fsa'
        uses: svenstaro/upload-release-action@v2
        with:
          file_glob: true
          file: ./scripts/piper/vits-piper-*.tar.bz2
          overwrite: true
          tag: tts-models
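
The workflow above fans out into 20 matrix jobs, each calling `generate.py --total 20 --index <i>`. `generate.py` itself is not shown in this part of the diff, so the following is only a sketch of one way such a split could work, assuming each job takes a contiguous chunk of the model list. The function name `shard` and the `model-N` names are hypothetical and do not come from the PR.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shard(items: Sequence[T], total: int, index: int) -> List[T]:
    """Return the contiguous chunk of `items` assigned to job `index` of `total`.

    Hypothetical helper: the real generate.py may partition differently.
    """
    if total <= 0:
        raise ValueError(f"total must be positive, got {total}")
    if not 0 <= index < total:
        raise ValueError(f"index must be in [0, {total}), got {index}")

    per_job = -(-len(items) // total)  # ceiling division
    start = index * per_job
    return list(items[start : start + per_job])


if __name__ == "__main__":
    # Placeholder model names, used only to show the split.
    models = [f"model-{i}" for i in range(45)]
    for index in range(20):
        print(index, shard(models, total=20, index=index))
```

With a contiguous split like this, every model is handled by exactly one job, and trailing jobs may receive an empty list when the model count is not a multiple of `total`.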
5 changes: 5 additions & 0 deletions scripts/piper/.gitignore
@@ -0,0 +1,5 @@
*.sh
*.onnx
*.json
MODEL_CARD
generate_samples-vits-piper*.py
117 changes: 117 additions & 0 deletions scripts/piper/add_meta_data.py
@@ -0,0 +1,117 @@
#!/usr/bin/env python3
# Copyright 2025 Xiaomi Corp. (authors: Fangjun Kuang)

import argparse
import json
from typing import Any, Dict

import onnx
from iso639 import Lang


def get_args():
    # For en_GB-semaine-medium
    # --name semaine
    # --kind medium
    # --lang en_GB
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )
    parser.add_argument(
        "--name",
        type=str,
        required=True,
    )

    parser.add_argument(
        "--kind",
        type=str,
        required=True,
    )

    parser.add_argument(
        "--lang",
        type=str,
        required=True,
    )
    return parser.parse_args()


def add_meta_data(filename: str, meta_data: Dict[str, Any]):
    """Add meta data to an ONNX model. It is changed in-place.

    Args:
      filename:
        Filename of the ONNX model to be changed.
      meta_data:
        Key-value pairs.
    """
    model = onnx.load(filename)

    while len(model.metadata_props):
        model.metadata_props.pop()

    for key, value in meta_data.items():
        meta = model.metadata_props.add()
        meta.key = key
        meta.value = str(value)

    onnx.save(model, filename)


def load_config(filename):
    # The config contains non-ASCII phoneme symbols, so read it as UTF-8.
    with open(filename, "r", encoding="utf-8") as file:
        config = json.load(file)
    return config


def generate_tokens(config):
    id_map = config["phoneme_id_map"]
    with open("tokens.txt", "w", encoding="utf-8") as f:
        for s, i in id_map.items():
            f.write(f"{s} {i[0]}\n")
    print("Generated tokens.txt")


# For en_US-lessac-medium.onnx, pass
#   --lang en_US --name lessac --kind medium
# The model is read from {lang}-{name}-{kind}.onnx in the current directory.
def main():
    args = get_args()
    print(args)
    lang = args.lang

    lang_iso = Lang(lang.split("_")[0])
    print(lang, lang_iso)

    kind = args.kind

    name = args.name

    # e.g., en_GB-alan-low.onnx.json
    config = load_config(f"{lang}-{name}-{kind}.onnx.json")

    print("generate tokens")
    generate_tokens(config)

    sample_rate = config["audio"]["sample_rate"]
    if sample_rate == 22500:
        print("Change sample rate from 22500 to 22050")
        sample_rate = 22050

    print("add model metadata")
    meta_data = {
        "model_type": "vits",
        "comment": "piper",  # must be piper for models from piper
        "language": lang_iso.name,
        "voice": config["espeak"]["voice"],  # e.g., en-us
        "has_espeak": 1,
        "n_speakers": config["num_speakers"],
        "sample_rate": sample_rate,
    }
    print(meta_data)
    add_meta_data(f"{lang}-{name}-{kind}.onnx", meta_data)


if __name__ == "__main__":
    main()
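
`generate_tokens` writes one `<symbol> <id>` line per entry of `phoneme_id_map`, using the first id in each list. The sketch below reproduces that formatting in memory and shows one consequence worth keeping in mind: a symbol can itself be a space, so a reader of `tokens.txt` cannot simply split on whitespace. The `id_map` excerpt is illustrative only, and `parse_token_line` is a hypothetical helper, not how sherpa-onnx necessarily parses the file.

```python
from typing import Dict, List, Tuple


def token_lines(id_map: Dict[str, List[int]]) -> List[str]:
    """Format each entry the same way generate_tokens does: '<symbol> <first id>'."""
    return [f"{symbol} {ids[0]}" for symbol, ids in id_map.items()]


def parse_token_line(line: str) -> Tuple[str, int]:
    """Hypothetical reader: split on the LAST space so a space symbol survives."""
    symbol, token_id = line.rsplit(" ", 1)
    return symbol, int(token_id)


if __name__ == "__main__":
    # Illustrative excerpt; a real config has many more symbols.
    id_map = {"_": [0], "^": [1], "$": [2], " ": [3], "a": [14]}
    for line in token_lines(id_map):
        print(repr(line), "->", parse_token_line(line))
```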
74 changes: 74 additions & 0 deletions scripts/piper/dynamic_quantization.py
@@ -0,0 +1,74 @@
#!/usr/bin/env python3
# Copyright 2025 Xiaomi Corp. (authors: Fangjun Kuang)

import argparse

import onnxmltools
from onnxmltools.utils.float16_converter import convert_float_to_float16
from onnxruntime.quantization import QuantType, quantize_dynamic


def get_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--input",
        type=str,
        required=True,
    )
    parser.add_argument(
        "--output-fp16",
        type=str,
        required=True,
    )

    parser.add_argument(
        "--output-int8",
        type=str,
        required=True,
    )
    return parser.parse_args()


# op_block_list keeps these ops in float32 to avoid the errors below; see also
# https://github.com/microsoft/onnxruntime/blob/089c52e4522491312e6839af146a276f2351972e/onnxruntime/python/tools/transformers/float16.py#L115
#
# libc++abi: terminating with uncaught exception of type Ort::Exception:
# Type Error: Type (tensor(float16)) of output arg (/dp/RandomNormalLike_output_0)
# of node (/dp/RandomNormalLike) does not match expected type (tensor(float)).
#
# libc++abi: terminating with uncaught exception of type Ort::Exception:
# This is an invalid model. Type Error: Type 'tensor(float16)' of input
# parameter (/enc_p/encoder/attn_layers.0/Constant_84_output_0) of
# operator (Range) in node (/Range_1) is invalid.
def export_onnx_fp16(onnx_fp32_path, onnx_fp16_path):
    onnx_fp32_model = onnxmltools.utils.load_model(onnx_fp32_path)
    onnx_fp16_model = convert_float_to_float16(
        onnx_fp32_model,
        keep_io_types=True,
        op_block_list=[
            "RandomNormalLike",
            "Range",
        ],
    )
    onnxmltools.utils.save_model(onnx_fp16_model, onnx_fp16_path)


def main():
    args = get_args()
    print(args)

    in_filename = args.input
    output_fp16 = args.output_fp16
    output_int8 = args.output_int8

    quantize_dynamic(
        model_input=in_filename,
        model_output=output_int8,
        weight_type=QuantType.QUInt8,
    )

    export_onnx_fp16(in_filename, output_fp16)


if __name__ == "__main__":
    main()
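
`quantize_dynamic(..., weight_type=QuantType.QUInt8)` stores weights as unsigned 8-bit integers. As a rough illustration of the general idea, and not of onnxruntime's actual implementation, the sketch below applies asymmetric uint8 quantization to a list of floats: the value range is mapped onto `[0, 255]` using a scale and a zero point. The helper names `quantize_uint8` and `dequantize_uint8` are hypothetical.

```python
from typing import List, Tuple


def quantize_uint8(values: List[float]) -> Tuple[List[int], float, int]:
    """Asymmetric uint8 quantization of a list of floats (illustrative only)."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-lo / scale)
    quantized = [min(255, max(0, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize_uint8(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Map uint8 codes back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.25, 2.0]
    q, scale, zero_point = quantize_uint8(weights)
    print("quantized:", q, "scale:", scale, "zero_point:", zero_point)
    print("restored:", dequantize_uint8(q, scale, zero_point))
```

Each restored value differs from the original by at most about one quantization step (`scale`), which is the precision traded for the smaller int8 file.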