🎨 Remove new_token_ids from warmup #292
base: main
👋 Hi! Thank you for contributing to vLLM support on Spyre.
Or this can be done with
Now you are good to go 🚀
dd4e9e8 to 5113a4f
bot:test
9ac70cb to f8fcd6d
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
f8fcd6d to 83059cb
  num_computed_tokens.append(prompt_len)
  cached_request_data = CachedRequestData(
      req_ids=req_ids,
      resumed_from_preemption=False,
-     new_token_ids=new_token_ids,
+     new_token_ids=[[] for _ in range(len(dummy_requests))],
Does this need to be set to anything? I'm not actually clear on whether the prefill pass returns a first sampled token that may be cached here.
The prefill pass does return a sampled token, but that caching happens within execute_model, so we don't need to pass in new_token_ids anymore.
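To make the point above concrete, here is a minimal, self-contained sketch of the idea. It does not use vLLM or the Spyre plugin: `CachedRequestDataSketch`, `ToyRunner`, and `build_warmup_decode_input` are hypothetical stand-ins invented for illustration, and the toy "sampling" is made up. It only shows why empty `new_token_ids` lists can be enough when the runner keeps the sampled token in its own state.

```python
from dataclasses import dataclass


@dataclass
class CachedRequestDataSketch:
    """Hypothetical stand-in for the cached-request record (not the vLLM class)."""
    req_ids: list[str]
    resumed_from_preemption: bool
    new_token_ids: list[list[int]]
    num_computed_tokens: list[int]


class ToyRunner:
    """Hypothetical runner that caches sampled tokens internally."""

    def __init__(self) -> None:
        self.sampled: dict[str, list[int]] = {}

    def prefill(self, req_id: str, prompt: list[int]) -> int:
        # Toy "sampling": last prompt token plus one. The result is cached
        # inside the runner, so the caller never has to pass it back in.
        token = prompt[-1] + 1
        self.sampled[req_id] = [token]
        return token

    def decode(self, cached: CachedRequestDataSketch) -> dict[str, int]:
        # The decode step reads the previous token from the runner's own
        # cache and ignores cached.new_token_ids entirely.
        out: dict[str, int] = {}
        for req_id in cached.req_ids:
            token = self.sampled[req_id][-1] + 1
            self.sampled[req_id].append(token)
            out[req_id] = token
        return out


def build_warmup_decode_input(
    prompts: dict[str, list[int]],
) -> CachedRequestDataSketch:
    req_ids = list(prompts)
    return CachedRequestDataSketch(
        req_ids=req_ids,
        resumed_from_preemption=False,
        # One empty list per dummy request, mirroring the change in this PR.
        new_token_ids=[[] for _ in range(len(req_ids))],
        num_computed_tokens=[len(p) for p in prompts.values()],
    )
```

In this sketch, calling `prefill` for each dummy request and then `decode` with the output of `build_warmup_decode_input` works without any token IDs being handed back by the caller. Whether the real `execute_model` behaves the same way under the upcoming compiler changes still needs to be tested, as noted elsewhere in this thread.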
@prashantgupta24 do we wanna get this merged? If so, we should definitely test to triple-check that this works with the upcoming compiler changes for continuous batching.
No hurry as such, don't want to add anything before the release lol
Description
🎨 Remove `new_token_ids` from warmup since `new_token_ids` are not used anymore.