cli : auto activate conversation mode if chat template is available (ggml-org#11214)
* cli : auto activate conversation mode if chat template is detected
* add warn on bad template
* update readme (writing with the help of chatgpt)
* update readme (2)
* do not activate -cnv for non-instruct models
+You can either manually download the GGUF file or directly use any `llama.cpp`-compatible models from Hugging Face by using this CLI argument: `-hf <user>/<model>[:quant]`
+
 After downloading a model, use the CLI tools to run it locally - see below.
 
 `llama.cpp` requires the model to be stored in the [GGUF](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md) file format. Models in other data formats can be converted to GGUF using the `convert_*.py` Python scripts in this repo.
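The `-hf <user>/<model>[:quant]` argument added above can be illustrated with a small dry-run sketch. `hf_spec` is a hypothetical helper (not part of `llama.cpp`), and the repository name is a placeholder; the script only prints the command line it would run, so it does not need `llama-cli` installed.

```bash
#!/bin/sh
# hf_spec: hypothetical helper that assembles the value for llama-cli's -hf
# argument in the form <user>/<model>[:quant]. The quant tag is optional.
hf_spec() {
    user=$1
    model=$2
    quant=${3:-}    # empty when no quantization tag is requested
    if [ -n "$quant" ]; then
        printf '%s/%s:%s\n' "$user" "$model" "$quant"
    else
        printf '%s/%s\n' "$user" "$model"
    fi
}

# Placeholder names: substitute a real GGUF repository on Hugging Face.
echo "llama-cli -hf $(hf_spec example-user example-model-GGUF)"
echo "llama-cli -hf $(hf_spec example-user example-model-GGUF Q4_K_M)"
```

Printing the command instead of executing it keeps the sketch side-effect free; drop the `echo` wrapper to actually invoke the tool.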
@@ -297,21 +299,12 @@ To learn more about model quantization, [read this documentation](examples/quant
 #### A CLI tool for accessing and experimenting with most of `llama.cpp`'s functionality.
 
 - <details open>
-<summary>Run simple text completion</summary>
-
-```bash
-llama-cli -m model.gguf -p "I believe the meaning of life is" -n 128
-
-# I believe the meaning of life is to find your own truth and to live in accordance with it. For me, this means being true to myself and following my passions, even if they don't align with societal expectations. I think that's what I love about yoga – it's not just a physical practice, but a spiritual one too. It's about connecting with yourself, listening to your inner voice, and honoring your own unique journey.
-```
-
-</details>
-
-- <details>
 <summary>Run in conversation mode</summary>
 
+Models with a built-in chat template will automatically activate conversation mode. If this doesn't occur, you can manually enable it by adding `-cnv` and specifying a suitable chat template with `--chat-template NAME`
+
 ```bash
-llama-cli -m model.gguf -p "You are a helpful assistant" -cnv
+llama-cli -m model.gguf
 
 # > hi, who are you?
 # Hi there! I'm your helpful assistant! I'm an AI-powered chatbot designed to assist and provide information to users like you. I'm here to help answer your questions, provide guidance, and offer support on a wide range of topics. I'm a friendly and knowledgeable AI, and I'm always happy to help with anything you need. What's on your mind, and how can I assist you today?
@@ -323,17 +316,28 @@ To learn more about model quantization, [read this documentation](examples/quant
 </details>
 
 - <details>
-<summary>Run with custom chat template</summary>
+<summary>Run in conversation mode with custom chat template</summary>
 
 ```bash
-# use the "chatml" template
-llama-cli -m model.gguf -p "You are a helpful assistant" -cnv --chat-template chatml
+# use the "chatml" template (use -h to see the list of supported templates)
+To disable conversation mode explicitly, use `-no-cnv`
+
+```bash
+llama-cli -m model.gguf -p "I believe the meaning of life is" -n 128 -no-cnv
+
+# I believe the meaning of life is to find your own truth and to live in accordance with it. For me, this means being true to myself and following my passions, even if they don't align with societal expectations. I think that's what I love about yoga – it's not just a physical practice, but a spiritual one too. It's about connecting with yourself, listening to your inner voice, and honoring your own unique journey.
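The three ways of controlling conversation mode described in this change (automatic detection, forcing it with `-cnv` plus `--chat-template NAME`, and disabling it with `-no-cnv`) can be summarized in a short dry-run sketch. `cnv_flags` and its mode names (`auto`, `chat`, `completion`) are hypothetical and exist only for illustration; the flags themselves are the ones quoted in the README text above.

```bash
#!/bin/sh
# cnv_flags: hypothetical helper that maps an intended mode to the
# llama-cli flags mentioned in the README diff.
#   auto        no flags; conversation mode is left to chat-template detection
#   chat        force conversation mode, optionally with a named template
#   completion  disable conversation mode explicitly
cnv_flags() {
    mode=$1
    template=${2:-}
    case "$mode" in
        auto)
            : ;;    # print nothing: rely on the model's built-in template
        chat)
            if [ -n "$template" ]; then
                printf '%s\n' "-cnv --chat-template $template"
            else
                printf '%s\n' "-cnv"
            fi ;;
        completion)
            printf '%s\n' "-no-cnv" ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1 ;;
    esac
}

# Dry run: print the command lines rather than executing llama-cli.
echo "llama-cli -m model.gguf $(cnv_flags chat chatml)"
echo "llama-cli -m model.gguf -p \"I believe the meaning of life is\" -n 128 $(cnv_flags completion)"
```

The flags are emitted through `printf '%s\n'` because a format string that itself begins with `-` can be misread as an option by some `printf` implementations.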