Need help understanding llama-bench.exe outputs #9124

Answered by slaren
william-efstratis asked this question in Q&A
The throughput in avg_ts includes the prompt tokens, i.e. it is calculated as 200 tokens / 1814609460 ns, which is about 110 t/s. If you prefer to count only the generated tokens, you can calculate that yourself, which would give you the 55 t/s.
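
To make the arithmetic concrete, here is a minimal Python sketch of both calculations. The 200 tokens and the 1814609460 ns come from the answer above. The split of 100 generated tokens is an assumption inferred from the 55 t/s figure, not something stated in the thread, and the helper name `tokens_per_second` is made up for illustration; it is not part of llama-bench.

```python
def tokens_per_second(n_tokens: int, elapsed_ns: int) -> float:
    """Convert a token count and an elapsed time in nanoseconds to tokens/second."""
    return n_tokens / (elapsed_ns / 1e9)


ELAPSED_NS = 1_814_609_460   # elapsed time quoted in the answer
TOTAL_TOKENS = 200           # prompt + generated tokens, as quoted in the answer
GENERATED_TOKENS = 100       # ASSUMPTION: inferred from the 55 t/s figure

# Throughput over all tokens (prompt included), as described for avg_ts.
combined_ts = tokens_per_second(TOTAL_TOKENS, ELAPSED_NS)

# Throughput counting only the generated tokens over the same elapsed time.
generated_ts = tokens_per_second(GENERATED_TOKENS, ELAPSED_NS)

print(f"prompt + generated: {combined_ts:.1f} t/s")
print(f"generated only:     {generated_ts:.1f} t/s")
```

With these inputs the first figure comes out at roughly 110 t/s and the second at roughly 55 t/s, so the two numbers differ only in which tokens are counted in the numerator.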

Answer selected by william-efstratis