I am trying to understand prompt caching and encountered an issue using the example from simple.cpp. I set the temperature to 0 to ensure fixed output and restricted the output to n_predict = 32.
When I repeatedly asked "Where is Africa?", the output stayed the same, as expected with temperature 0. After each decode, I saved the state using llama_state_save_file(). However, I noticed that the output file grows by 2-3 MB after each decode. If I increase n_predict to 256, the increase is even larger, ~26 MB.
Given that both the prompt and the decoded output are identical, shouldn't the state, and therefore the output file size, stay the same?
I would appreciate any help to understand this better.
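As a rough sanity check on the numbers above, here is a minimal sketch that converts the observed growth into a per-token figure. The helper name `growth_per_token_kib` is hypothetical, and 2.5 MB is my assumed midpoint of the 2-3 MB range; nothing here calls into llama.cpp.

```cpp
#include <cassert>

// Hypothetical helper: average growth of the saved state file per generated
// token, in KiB. growth_mb is the observed size increase per run in MB,
// n_predict is the number of tokens generated in that run.
double growth_per_token_kib(double growth_mb, int n_predict) {
    assert(n_predict > 0);  // avoid dividing by zero
    return growth_mb * 1024.0 / static_cast<double>(n_predict);
}
```

Calling `growth_per_token_kib(2.5, 32)` and `growth_per_token_kib(26.0, 256)` gives per-token increments of the same order of magnitude for both runs, so the file seems to grow roughly in proportion to the number of tokens decoded rather than by a fixed amount per save. Whether that is because the context carries tokens over from one run to the next is only a guess on my part.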