
Commit 51cc2ad

Update README.md with revised model path and quantization details
Update the example LLaMA model file path to reflect the new naming convention. Clarify support for GGUF format models, specifying full FP16 support and partial support for Q8_0 and Q4_0 quantization.
1 parent 4607886 commit 51cc2ad

File tree

1 file changed (+2 additions, -2 deletions)

README.md

Lines changed: 2 additions & 2 deletions
@@ -237,7 +237,7 @@ GPU-accelerated LLaMA.java model runner using TornadoVM
 
 options:
   -h, --help            show this help message and exit
-  --model MODEL_PATH    Path to the LLaMA model file (e.g., Llama-3.2-1B-Instruct-Q8_0.gguf) (default: None)
+  --model MODEL_PATH    Path to the LLaMA model file (e.g., beehive-llama-3.2-8b-instruct-fp16.gguf) (default: None)
 
 LLaMA Configuration:
   --prompt PROMPT       Input prompt for the model (default: None)
@@ -379,7 +379,7 @@ llama-tornado --gpu --model beehive-llama-3.2-1b-instruct-fp16.gguf --prompt "te
 
 ## Current Features & Roadmap
 
-- **Support for GGUF format models** with Q8_0 and Q4_0 quantization.
+- **Support for GGUF format models** with full FP16 and partial support for Q8_0 and Q4_0 quantization.
 - **Instruction-following and chat modes** for various use cases.
 - **Interactive CLI** with `--interactive` and `--instruct` modes.
 - **Flexible backend switching** - choose OpenCL or PTX at runtime (need to build TornadoVM with both enabled).
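To illustrate the change, the sketch below assembles a `llama-tornado` invocation using the new example model filename from this commit. It only builds and prints the command string; the `.gguf` path is an assumption, so point it at wherever your model file actually lives before running the tool for real.

```shell
# Sketch only: compose the llama-tornado command line using the FP16 model
# filename this commit introduces in the README. Nothing is executed here;
# the command string is just echoed for inspection.
MODEL="beehive-llama-3.2-8b-instruct-fp16.gguf"   # full FP16 support per this commit
CMD="llama-tornado --gpu --model $MODEL --prompt \"tell me a joke\""
echo "$CMD"
```

Q8_0 and Q4_0 quantized models would be passed the same way via `--model`, with the caveat from the commit message that their support is partial.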
