-
I want to use llamafile/llamafiler for FIM completion in VS Code, and using the llama.cpp features looks like the best option. The documentation points to a file of llama.cpp examples, but it's too abstract and doesn't really help with running from the command line.

The command I'm trying to translate from llama.cpp is:

    llama-server -m qwen2.5-coder-3b-q8_0.gguf --port 8080 -ngl 99 -fa -ub 1024 -b 1024 -dt 0.1 --ctx-size 0 --cache-reuse 256

The furthest I can get without "command not found" errors is:

    ./llamafile -m qwen2.5-coder-3b-q8_0.gguf --server -ngl 99 -fa -b 1024 --ctx-size 0

After hours of reading, I still have no clue what the equivalents are for the remaining flags (--port, -ub, -dt, and --cache-reuse). Could anyone offer some guidance or advice?
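For what it's worth, one way to confirm whether a given llamafile build recognizes these flags at all is to filter its usage text, assuming the binary prints a usage listing for --help (a minimal sketch; the flag names are the ones missing from my second command above):

    # Filter llamafile's usage output for the flags in question.
    # Assumes ./llamafile is executable and that --help emits a plain usage listing.
    ./llamafile --help | grep -E -- '--port|-ub|-dt|--cache-reuse'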
Replies: 1 comment 1 reply
-
Llamafile is quite far behind upstream llama.cpp in terms of features, so those flags are likely not supported at this time.
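If those flags matter for your FIM setup, one fallback while llamafile catches up is to serve the model with upstream llama.cpp directly; this is just the original command from the question, restated as a runnable invocation:

    # Upstream llama.cpp server, with all the flags from the original command.
    llama-server -m qwen2.5-coder-3b-q8_0.gguf --port 8080 \
      -ngl 99 -fa -ub 1024 -b 1024 -dt 0.1 \
      --ctx-size 0 --cache-reuse 256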