How are we handling the local model loading process for `.gguf` files in iOS? #83
-
**What Stanford Spezi module is your challenge related to?**

Spezi

**Description**

Hello developer, I hope you are doing well. I found your repository at StanfordSpezi/SpeziLLM and have been following the instructions for using SpeziLLM to load local LLM models in an iOS application. However, I keep encountering an error: `SpeziLLMLocal: Local LLM file could not be opened`, indicating that the model file doesn't exist. This happens even though the model file appears to be recognized correctly (its file size is reported). The model initially loads without error, but as soon as I attempt to generate a response, I get an "LLM file not found" error at runtime.

Could you please share how you are handling the local model loading process for `.gguf` files in iOS? Do you have any specific path configuration or other steps inside the MLX/SpeziLLMLocal pipeline that I should know about, beyond placing the `.gguf` file in the Documents directory and setting up `LLMLocalSchema`? A minimal sketch of how we locate the file is shown below.

Thank you in advance for any guidance. I appreciate your time!

Best regards,
korean student(01H-W-H10)
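For context, this is roughly how we resolve the model file in the Documents directory before handing it to the schema (a minimal sketch using only Foundation; the file name is the model we are testing and everything beyond this existence check is app-specific):

```swift
import Foundation

// Minimal sketch: resolve the expected model location in the app's
// Documents directory and verify the file is actually there.
// The file name below is an assumption based on the model we are testing.
let documentsURL = FileManager.default
    .urls(for: .documentDirectory, in: .userDomainMask)[0]
let modelURL = documentsURL
    .appendingPathComponent("tinyllama-1.1b-chat-v1.0.Q4_0.gguf")

if FileManager.default.fileExists(atPath: modelURL.path) {
    // The file is found and its size is reported correctly here,
    // yet generation still fails later with "LLM file not found".
    print("Model found at \(modelURL.path)")
} else {
    print("Model missing at \(modelURL.path)")
}
```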
-
@shimjunho Thank you for reaching out! Could you please follow these steps to help us help you:
-
Hello! I am on the same team as shimjunho.
Hi @shimjunho, thank you for your question!
The most recent SpeziLLM version does not support `.gguf` models; we have updated the module to work with MLX. Therefore, if you want to use the `tinyllama-1.1b-chat-v1.0.Q4_0` model, you can either convert it yourself using the MLX convert function, or search Hugging Face (https://huggingface.co/mlx-community?search_models=llama&sort_models=downloads#models) for an existing MLX model (e.g. https://huggingface.co/pcuenq/tiny-llama-chat-mlx).

I would also like to point out a somewhat more powerful model: https://huggingface.co/mlx-community/Llama-3.2-1B-Instruct-4bit
In SpeziLLM, you can choose to either set the model in the config to `.llama3_2_1B_4bit` …
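As a rough illustration, wiring that model into an app might look like the sketch below. This is a minimal sketch assuming the MLX-based `LLMLocalPlatform`/`LLMLocalSchema(model:)` API from recent SpeziLLMLocal releases; the exact initializers and parameters may differ between versions, so please check the current SpeziLLM documentation.

```swift
import Spezi
import SpeziLLM
import SpeziLLMLocal

// Minimal sketch, assuming the MLX-based SpeziLLMLocal API:
// register the local LLM platform once in the app's Spezi configuration.
class LocalLLMAppDelegate: SpeziAppDelegate {
    override var configuration: Configuration {
        Configuration {
            LLMRunner {
                LLMLocalPlatform()
            }
        }
    }
}

// Later, request a session for the model mentioned above.
// `.llama3_2_1B_4bit` is the config value named in this reply; further
// schema parameters (sampling, context size, …) are release-dependent.
func makeSession(using runner: LLMRunner) -> LLMLocalSession {
    runner(
        with: LLMLocalSchema(model: .llama3_2_1B_4bit)
    )
}
```

With this setup, SpeziLLM manages downloading and locating the MLX model itself, so no manual `.gguf` path handling in the Documents directory should be needed.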