Description
Self Checks
- This template is only for bug reports. For questions, please visit Discussions.
- I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
- I have searched for existing issues, including closed ones.
- I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- Please do not modify this template and fill in all required fields.
Cloud or Self Hosted
Self Hosted (Docker)
Environment Details
absl-py==2.3.1
aiofiles==24.1.0
aiohappyeyeballs==2.6.1
aiohttp==3.12.15
aiosignal==1.4.0
annotated-types==0.7.0
antlr4-python3-runtime==4.9.3
anyio==4.10.0
argbind==0.3.9
asttokens==3.0.0
attrs==25.3.0
audioread==3.0.1
baize==0.23.1
Brotli==1.1.0
cachetools==6.2.0
certifi==2025.8.3
cffi==1.17.1
charset-normalizer==3.4.3
click==8.2.1
coloredlogs==15.0.1
contourpy==1.3.3
cycler==0.12.1
datasets==2.18.0
decorator==5.2.1
descript-audio-codec==1.0.0
descript-audiotools==0.7.2
dill==0.3.8
docstring_parser==0.17.0
einops==0.8.1
einx==0.2.2
executing==2.2.1
fastapi==0.116.1
ffmpy==0.6.1
filelock==3.19.1
fire==0.7.1
-e git+https://github.com/fishaudio/fish-speech.git@5a89fe56cbfdac516c87e82f361770d5240e3aa6#egg=fish_speech
flatbuffers==25.2.10
flatten-dict==0.4.2
fonttools==4.59.2
frozendict==2.4.6
frozenlist==1.7.0
fsspec==2024.2.0
future==1.0.0
gitdb==4.0.12
GitPython==3.1.45
gradio==5.44.1
gradio_client==1.12.1
groovy==0.1.2
grpcio==1.74.0
h11==0.16.0
hf-xet==1.1.9
httpcore==1.0.9
httpx==0.28.1
huggingface-hub==0.34.4
humanfriendly==10.0
hydra-core==1.3.2
idna==3.10
importlib_resources==6.5.2
ipython==9.5.0
ipython_pygments_lexers==1.1.1
jedi==0.19.2
Jinja2==3.1.6
joblib==1.5.2
julius==0.2.7
kiwisolver==1.4.9
kui==1.13.0
lazy_loader==0.4
librosa==0.11.0
lightning==2.5.5
lightning-utilities==0.15.2
llvmlite==0.44.0
loguru==0.7.3
loralib==0.1.2
Markdown==3.9
markdown-it-py==4.0.0
markdown2==2.5.4
MarkupSafe==3.0.2
matplotlib==3.10.6
matplotlib-inline==0.1.7
mdurl==0.1.2
modelscope==1.17.1
mpmath==1.3.0
msgpack==1.1.1
multidict==6.6.4
multiprocess==0.70.16
natsort==8.4.0
networkx==3.5
numba==0.61.2
numpy==1.26.4
nvidia-cublas-cu12==12.8.4.1
nvidia-cuda-cupti-cu12==12.8.90
nvidia-cuda-nvrtc-cu12==12.8.93
nvidia-cuda-runtime-cu12==12.8.90
nvidia-cudnn-cu12==9.10.2.21
nvidia-cufft-cu12==11.3.3.83
nvidia-cufile-cu12==1.13.1.3
nvidia-curand-cu12==10.3.9.90
nvidia-cusolver-cu12==11.7.3.90
nvidia-cusparse-cu12==12.5.8.93
nvidia-cusparselt-cu12==0.7.1
nvidia-nccl-cu12==2.27.3
nvidia-nvjitlink-cu12==12.8.93
nvidia-nvtx-cu12==12.8.90
omegaconf==2.3.0
onnxruntime==1.22.1
opencc-python-reimplemented==0.1.7
orjson==3.11.3
ormsgpack==1.10.0
packaging==25.0
pandas==2.3.2
parso==0.8.5
pexpect==4.9.0
pillow==11.3.0
platformdirs==4.4.0
pooch==1.8.2
prompt_toolkit==3.0.52
propcache==0.3.2
protobuf==3.19.6
ptyprocess==0.7.0
pure_eval==0.2.3
pyarrow==21.0.0
pyarrow-hotfix==0.7
PyAudio==0.2.14
pycparser==2.22
pydantic==2.9.2
pydantic_core==2.23.4
pydub==0.25.1
Pygments==2.19.2
pyloudnorm==0.1.1
pyparsing==3.2.3
pyrootutils==1.0.4
pystoi==0.4.1
python-dateutil==2.9.0.post0
python-dotenv==1.1.1
python-multipart==0.0.20
pytorch-lightning==2.5.5
pytz==2025.2
PyYAML==6.0.2
randomname==0.2.1
regex==2025.9.1
requests==2.32.5
resampy==0.4.3
rich==14.1.0
ruff==0.12.12
safehttpx==0.1.6
safetensors==0.6.2
scikit-learn==1.7.1
scipy==1.16.1
semantic-version==2.10.0
sentry-sdk==2.37.0
setuptools==80.9.0
shellingham==1.5.4
silero-vad==6.0.0
six==1.17.0
smmap==5.0.2
sniffio==1.3.1
soundfile==0.13.1
soxr==1.0.0
stack-data==0.6.3
starlette==0.47.3
sympy==1.14.0
tensorboard==2.20.0
tensorboard-data-server==0.7.2
termcolor==3.1.0
threadpoolctl==3.6.0
tiktoken==0.11.0
tokenizers==0.22.0
tomlkit==0.13.3
torch==2.8.0
torch-stoi==0.2.3
torchaudio==2.8.0
torchmetrics==1.8.2
tqdm==4.67.1
traitlets==5.14.3
transformers==4.56.1
triton==3.4.0
typer==0.17.4
typing_extensions==4.15.0
tzdata==2025.2
urllib3==2.5.0
uvicorn==0.35.0
wandb==0.21.3
wcwidth==0.2.13
websockets==15.0.1
Werkzeug==3.1.3
wheel==0.45.1
xxhash==3.5.0
yarl==1.20.1
zstandard==0.24.0
Steps to Reproduce
1. python fish_speech/models/dac/inference.py -i "data/tiaoxuan/common_voice_zh-CN_32186218.mp3" --checkpoint-path "checkpoints/openaudio-s1-mini/codec.pth"
2. python fish_speech/models/text2semantic/inference.py --text "证书N N N N将过期" --prompt-text "老普林尼效仿了斯多葛学派哲学家塞内卡的写作风格。" --prompt-tokens "fake.npy" --half
3. python fish_speech/models/dac/inference.py -i "temp/codes_0.npy"
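For convenience, the three steps above could be chained in a small driver script. This is only a sketch: the arguments are copied verbatim from the steps, while `COMMANDS` and `run_all` are hypothetical names introduced here and are not part of fish-speech. It assumes the script is launched from the repository root with the environment listed above active.

```python
import subprocess
import sys

# Hypothetical helper names; the arguments are copied from the steps above.
PY = sys.executable  # interpreter of the currently active environment

COMMANDS = [
    # Step 1: encode the reference audio with the codec checkpoint.
    [PY, "fish_speech/models/dac/inference.py",
     "-i", "data/tiaoxuan/common_voice_zh-CN_32186218.mp3",
     "--checkpoint-path", "checkpoints/openaudio-s1-mini/codec.pth"],
    # Step 2: generate semantic tokens for the target text.
    [PY, "fish_speech/models/text2semantic/inference.py",
     "--text", "证书N N N N将过期",
     "--prompt-text", "老普林尼效仿了斯多葛学派哲学家塞内卡的写作风格。",
     "--prompt-tokens", "fake.npy",
     "--half"],
    # Step 3: decode the generated codes back to audio.
    [PY, "fish_speech/models/dac/inference.py",
     "-i", "temp/codes_0.npy"],
]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_all(COMMANDS)` executes the steps in sequence; `check=True` makes a failing step raise `subprocess.CalledProcessError` instead of continuing silently.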
✔️ Expected Behavior
The generated speech should contain "证书N N N将过期".