DakeQQ/Text-to-Speech-TTS-ONNX

Text-to-Speech-TTS-ONNX

Runs text-to-speech (TTS) models with ONNX Runtime.

Features

  1. Supported Models:

  2. End-to-End Processing:

    • The solution includes internal STFT/ISTFT processing.
    • Input: reference audio + text
    • Output: generated speech
  3. Optimization:

    • The key model components run entirely on GPU operators (100% GPU operator deployment).
  4. Resources:
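
As a rough illustration of the GPU-first deployment described above, the sketch below orders ONNX Runtime execution providers so that CUDA is tried first with the CPU as fallback, then lists the inputs and outputs an exported model declares. It assumes the `onnxruntime` Python package and a CUDA build; `pick_providers`, `describe_model`, and the path `model.onnx` are hypothetical names for illustration and do not come from this repository.

```python
from typing import List, Sequence

# These are real ONNX Runtime execution-provider identifiers.
GPU_PROVIDER = "CUDAExecutionProvider"
CPU_PROVIDER = "CPUExecutionProvider"


def pick_providers(available: Sequence[str], prefer_gpu: bool = True) -> List[str]:
    """Order providers so the GPU is tried first and the CPU is the fallback."""
    chosen = []
    if prefer_gpu and GPU_PROVIDER in available:
        chosen.append(GPU_PROVIDER)
    if CPU_PROVIDER in available:
        chosen.append(CPU_PROVIDER)
    return chosen


def describe_model(model_path: str) -> None:
    """Open a session and print the inputs and outputs the model declares."""
    # Imported lazily so pick_providers() stays usable without onnxruntime.
    import onnxruntime as ort

    session = ort.InferenceSession(
        model_path, providers=pick_providers(ort.get_available_providers())
    )
    for tensor in session.get_inputs():
        print("input ", tensor.name, tensor.shape, tensor.type)
    for tensor in session.get_outputs():
        print("output", tensor.name, tensor.shape, tensor.type)


if __name__ == "__main__":
    # Show the ordering without needing onnxruntime or a model file.
    print(pick_providers(["CPUExecutionProvider", "CUDAExecutionProvider"]))
    # describe_model("model.onnx")  # hypothetical path; use a real export here
```

Inspecting the declared inputs before building the feed dictionary avoids guessing tensor names for the reference audio and text, which vary from one exported model to another.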


Performance

| OS | Device | Backend | Model | Time Cost in Seconds (reference audio: 6s / generates approximately 15 words of speech) |
|---|---|---|---|---|
| Ubuntu-24.04 | Laptop | CPU i7-1165G7 | F5-TTS F32 | 180 (NFE=32) |
| Ubuntu-24.04 | Laptop | GPU MX150 | F5-TTS F32 | 62 (NFE=32) |
| Ubuntu-24.04 | Laptop | CPU i7-1165G7 | IndexTTS F32 | 18 |
| Ubuntu-24.04 | Laptop | GPU MX150 | BigVGAN V2 24khz_100band_256x F16 | 4.6 (input mel = (1, 100, 512)) |
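
The README does not say how these timings were taken. The sketch below shows one plausible way to collect comparable wall-clock numbers and to compare two rows of the table. `time_call` and `speedup` are hypothetical helpers, and the built-in `sum` stands in for a real inference call.

```python
import time
from typing import Any, Callable, Tuple


def time_call(fn: Callable[..., Any], *args: Any, repeats: int = 3) -> Tuple[float, Any]:
    """Run fn(*args) `repeats` times; return (best wall-clock seconds, last result)."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # `sum` is a placeholder workload; swap in a call that runs the model.
    seconds, value = time_call(sum, range(1_000_000))
    print(f"best of 3: {seconds:.4f} s, result={value}")
    # Ratio of the two F5-TTS rows in the table (180 s on CPU, 62 s on GPU).
    print(f"F5-TTS GPU speedup over CPU: {speedup(180, 62):.1f}x")
```

Taking the best of several runs reduces the influence of one-off costs such as session warm-up, which can dominate the first call on a GPU backend.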

To-Do List

