I cloned and built llama.cpp on my Ubuntu box with an A100 GPU. How should I be running this example, as shown below?
I think I figured it out; this is the right command:
./llama-parallel --prompt "where is bangalore\nwho is lord krishna" --parallel 2 --cont-batching --sequences 2 -ntg 1024 -npp 1024,1024 -ngl 35
Actually, you need to set the number of sequences equal to the number of prompts that need to be processed. Hope this is useful to others.
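Since the sequence count has to match the number of newline-separated prompts, one way to keep the two in sync is to derive the count from the prompt string itself. This is just a sketch of my own (the `PROMPTS`/`NSEQ` variables are not part of llama.cpp), using `printf '%b'` to expand the literal `\n` separators:

```shell
# Sketch (not part of llama.cpp): count the newline-separated prompts
# and reuse that count for --parallel and --sequences.
PROMPTS="where is bangalore\nwho is lord krishna"

# printf '%b' expands the literal \n so each prompt lands on its own line
NSEQ=$(printf '%b\n' "$PROMPTS" | wc -l)
echo "sequences: $NSEQ"   # 2 prompts -> 2 sequences

# Then invoke llama-parallel with the derived count, e.g.:
# ./llama-parallel --prompt "$PROMPTS" --parallel "$NSEQ" --cont-batching \
#     --sequences "$NSEQ" -ngl 35
```

This way, adding a third prompt to `PROMPTS` automatically bumps both `--parallel` and `--sequences` to 3 without editing the flags by hand.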