examples/server/README.md (1 addition, 0 deletions)

@@ -39,6 +39,7 @@ see https://github.com/ggerganov/llama.cpp/issues/1437
 -`--mmproj MMPROJ_FILE`: Path to a multimodal projector file for LLaVA.
 -`--grp-attn-n`: Set the group attention factor to extend context size through self-extend(default: 1=disabled), used together with group attention width `--grp-attn-w`
 -`--grp-attn-w`: Set the group attention width to extend context size through self-extend(default: 512), used together with group attention factor `--grp-attn-n`
+-`-n, --n-predict`: Set the maximum tokens to predict (default: -1)
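The added `-n, --n-predict` line documents an existing flag rather than new behavior. As a minimal launch sketch combining it with the self-extend flags documented above; the binary name, model path, and chosen values are assumptions, not part of this change:

```bash
# Hypothetical invocation; binary name and model path are assumptions.
# -n caps generation at 256 tokens (default -1 means no limit);
# the two grp-attn flags enable self-extend with factor 4 over a
# width of 1024 tokens (the width should be divisible by the factor).
./server -m models/model.gguf \
    -n 256 \
    --grp-attn-n 4 --grp-attn-w 1024
```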
printf(" -gan N, --grp-attn-n N set the group attention factor to extend context size through self-extend(default: 1=disabled), used together with group attention width `--grp-attn-w`");
1922
1935
printf(" -gaw N, --grp-attn-w N set the group attention width to extend context size through self-extend(default: 512), used together with group attention factor `--grp-attn-n`");
1923
1936
printf(" --chat-template FORMAT_NAME");
1924
-
printf(" set chat template, possible valus is: llama2, chatml (default %s)", sparams.chat_template.c_str());
1937
+
printf(" set chat template, possible value is: llama2, chatml (default %s)", sparams.chat_template.c_str());