Hi @shimjunho, thank you for your question!

The most recent SpeziLLM version no longer supports .gguf models; the module has been updated to run on MLX. If you want to use the tinyllama-1.1b-chat-v1.0.Q4_0 model, you can either convert it yourself with the MLX convert function, or search Hugging Face's mlx-community organization (https://huggingface.co/mlx-community?search_models=llama&sort_models=downloads#models) for an existing conversion (e.g. https://huggingface.co/pcuenq/tiny-llama-chat-mlx).
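
If you go the Hugging Face route, referencing such a community conversion might look roughly like the sketch below. Note that the `.custom(id:)` case is my assumption about the current `LLMLocalModel` enum, and the model ID is just the example linked above; please verify both against the SpeziLLM documentation:

```swift
import SpeziLLMLocal

// Minimal sketch: point the local schema at an MLX-converted model
// by its Hugging Face ID. `.custom(id:)` is assumed here; check the
// LLMLocalModel enum in your SpeziLLM version.
let schema = LLMLocalSchema(
    model: .custom(id: "pcuenq/tiny-llama-chat-mlx")
)
```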

I would also like to point out a somewhat more powerful model: https://huggingface.co/mlx-community/Llama-3.2-1B-Instruct-4bit.
In SpeziLLM, you can set this model in the config via `.llama3_2_1B_4bit`.
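
For context, a minimal usage sketch along the lines of the SpeziLLM documentation could look like this. It assumes a SwiftUI app with the Spezi `LLMRunner` already configured in the app delegate; the exact names follow my reading of the current API, so treat this as a starting point rather than a definitive implementation:

```swift
import SpeziLLM
import SpeziLLMLocal
import SwiftUI

struct LLMDemoView: View {
    @Environment(LLMRunner.self) var runner
    @State var responseText = ""

    var body: some View {
        Text(responseText)
            .task {
                // Sketch: create a local session with the suggested model.
                let session: LLMLocalSession = runner(
                    with: LLMLocalSchema(model: .llama3_2_1B_4bit)
                )

                do {
                    // Stream the generated tokens into the view.
                    for try await token in try await session.generate() {
                        responseText.append(token)
                    }
                } catch {
                    responseText = "Generation failed: \(error)"
                }
            }
    }
}
```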
