Description
Self Checks
- This template is only for bug reports. For questions, please visit Discussions.
- I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
- I have searched for existing issues, including closed ones.
- I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- Please do not modify this template and fill in all required fields.
Cloud or Self Hosted
Self Hosted (Docker)
Environment Details
I'm trying to launch Docker Compose following the installation guide described here: https://speech.fish.audio/install/.
I downloaded the model via git clone from https://huggingface.co/fishaudio/openaudio-s1-mini.
My OS is Ubuntu 22.04.
But I'm getting an error that looks like a problem with the model from HF.
Steps to Reproduce
- Download fish-speech using git clone https://github.com/fishaudio/fish-speech.git
- Download the model from https://huggingface.co/fishaudio/openaudio-s1-mini into the checkpoints folder.
- Create a .env file with the following settings:
BACKEND=cuda # or cpu
COMPILE=1 # Enable compile optimization
GRADIO_PORT=7860 # WebUI port
API_PORT=8080 # API server port
UV_VERSION=0.8.15 # UV package manager version
- Run docker compose --profile webui up
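For reference, the failing load can be reproduced outside Docker with a short sketch. This assumes the weight file is checkpoints/openaudio-s1-mini/model.pth (the exact file name inside the checkpoint folder may differ) and mirrors the mmap-based torch.load that the traceback below points at:

```python
# Hedged sketch: reproduce the mmap-based torch.load outside Docker.
# The checkpoint file name is an assumption; adjust it to the actual
# weight file inside checkpoints/openaudio-s1-mini.
import torch

checkpoint = "checkpoints/openaudio-s1-mini/model.pth"

# mmap=True matches the load that fails in from_pretrained (see traceback below)
weights = torch.load(checkpoint, map_location="cpu", mmap=True)
print(f"loaded state dict with {len(weights)} entries")
```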
✔️ Expected Behavior
Both the WebUI and the API server work as intended.
❌ Actual Behavior
webui-1 | [2025-10-24 10:59:16] Starting Fish Speech WebUI...
webui-1 | [2025-10-24 10:59:16] Device args: none
webui-1 | [2025-10-24 10:59:16] Compile args: --compile
webui-1 | [2025-10-24 10:59:16] Server: 0.0.0.0:7860
/app/.venv/lib/python3.12/site-packages/audiotools/core/audio_signal.py:32: SyntaxWarning: invalid escape sequence '\_'
webui-1 | """
webui-1 | /app/.venv/lib/python3.12/site-packages/audiotools/core/audio_signal.py:1012: SyntaxWarning: invalid escape sequence '\_'
webui-1 | """Wrapper around scipy.signal.get_window so one can also get the
webui-1 | /app/.venv/lib/python3.12/site-packages/audiotools/core/audio_signal.py:1092: SyntaxWarning: invalid escape sequence '\_'
webui-1 | """Compute how the STFT should be padded, based on match\_stride.
webui-1 | /app/.venv/lib/python3.12/site-packages/audiotools/core/audio_signal.py:1131: SyntaxWarning: invalid escape sequence '\_'
webui-1 | """Computes the short-time Fourier transform of the audio data,
webui-1 | /app/.venv/lib/python3.12/site-packages/audiotools/core/audio_signal.py:1222: SyntaxWarning: invalid escape sequence '\_'
webui-1 | """Computes inverse STFT and sets it to audio\_data.
webui-1 | 2025-10-24 10:59:35.472 | INFO | __main__:<module>:59 - Loading Llama model...
webui-1 | 2025-10-24 10:59:35.683 | INFO | fish_speech.models.text2semantic.llama:from_pretrained:432 - Loading model from checkpoints/openaudio-s1-mini, config: DualARModelArgs(model_type='dual_ar', vocab_size=155776, n_layer=28, n_head=16, dim=1024, intermediate_size=3072, n_local_heads=8, head_dim=128, rope_base=1000000, norm_eps=1e-06, max_seq_len=8192, dropout=0.0, tie_word_embeddings=False, attention_qkv_bias=False, attention_o_bias=False, attention_qk_norm=True, codebook_size=4096, num_codebooks=10, use_gradient_checkpointing=True, initializer_range=0.03125, is_reward_model=False, scale_codebook_embeddings=True, n_fast_layer=4, fast_dim=1024, fast_n_head=16, fast_n_local_heads=8, fast_head_dim=64, fast_intermediate_size=3072, fast_attention_qkv_bias=False, fast_attention_qk_norm=False, fast_attention_o_bias=False)
webui-1 | Exception in thread Thread-2 (worker):
webui-1 | Traceback (most recent call last):
webui-1 | File "/usr/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
webui-1 | self.run()
webui-1 | File "/usr/lib/python3.12/threading.py", line 1010, in run
webui-1 | self._target(*self._args, **self._kwargs)
webui-1 | File "/app/fish_speech/models/text2semantic/inference.py", line 540, in worker
webui-1 | model, decode_one_token = init_model(
webui-1 | ^^^^^^^^^^^
webui-1 | File "/app/fish_speech/models/text2semantic/inference.py", line 354, in init_model
webui-1 | model = DualARTransformer.from_pretrained(checkpoint_path, load_weights=True)
webui-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
webui-1 | File "/app/fish_speech/models/text2semantic/llama.py", line 456, in from_pretrained
webui-1 | weights = torch.load(
webui-1 | ^^^^^^^^^^^
webui-1 | File "/app/.venv/lib/python3.12/site-packages/torch/serialization.py", line 1539, in load
webui-1 | raise RuntimeError(
webui-1 | RuntimeError: mmap can only be used with files saved with `torch.save(_use_new_zipfile_serialization=True), please torch.save your checkpoint with this option in order to use mmap.
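For context, torch.load with mmap=True only works on checkpoints saved in the zip-based serialization format, and a plain git clone of a Hugging Face repo without Git LFS installed leaves small text pointer files in place of the real weights, which is a likely cause of exactly this error. Below is a hedged sketch (the model.pth file name is an assumption) that checks whether the downloaded file is a real zip-format checkpoint or an LFS pointer stub:

```python
# Hedged sketch: check whether the downloaded weight file is a real
# zip-format PyTorch checkpoint or just a git-lfs pointer stub.
# NOTE: the file name below is an assumption; point it at the actual
# weight file inside checkpoints/openaudio-s1-mini.
import os
import zipfile

path = "checkpoints/openaudio-s1-mini/model.pth"

print(f"size: {os.path.getsize(path)} bytes")  # an LFS pointer is only ~100 bytes

if zipfile.is_zipfile(path):
    print("Zip-based torch.save checkpoint; mmap loading should work.")
else:
    with open(path, "rb") as f:
        head = f.read(200)
    if head.startswith(b"version https://git-lfs"):
        print("git-lfs pointer file; the real weights were never downloaded.")
    else:
        print("Not a zip archive; legacy-format checkpoints cannot be loaded with mmap=True.")
```

If the file turns out to be a pointer stub, running git lfs install followed by git lfs pull inside the model repo (or re-downloading with huggingface-cli download fishaudio/openaudio-s1-mini) should fetch the actual weights.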