Support top level response caching for ensemble models #338

Merged: 46 commits, May 9, 2024

Contributor

@lkomali lkomali commented Apr 4, 2024

ref slack thread: https://nvidia.slack.com/archives/CAZKCU4UV/p1677717244222069

Currently, caching of the top-level request sent to the ensemble scheduler is not supported.
This PR implements caching of top-level requests for ensemble models.
On a cache hit, the response is sent without executing the composing models.
On a cache miss, the ensemble pipeline runs as is.
Changed the logic for computing cache miss latency: cache lookup time + ensemble pipeline time + cache insertion time.
Moved similar logic from ensemble_scheduler.cc and dynamic_batch_scheduler.cc to scheduler_utils.cc to reduce code duplication.
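
For readers who want the hit/miss flow and the latency formula in concrete form, here is a minimal sketch. It is not the code from this PR: `ToyResponseCache`, `Timings`, `ElapsedNs`, and `RunEnsemble` are hypothetical names, the cache is a plain in-memory map keyed by a string, and the ensemble pipeline is reduced to a single callable.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for illustration; none of these names come from
// the actual scheduler sources.
struct Timings {
  bool cache_hit = false;
  uint64_t lookup_ns = 0;
  uint64_t pipeline_ns = 0;
  uint64_t insert_ns = 0;

  // Cache miss latency as described above:
  // cache lookup time + ensemble pipeline time + cache insertion time.
  uint64_t CacheMissLatencyNs() const
  {
    return lookup_ns + pipeline_ns + insert_ns;
  }
};

// Toy response cache: maps an (assumed) request key to a response string.
class ToyResponseCache {
 public:
  std::optional<std::string> Lookup(const std::string& key) const
  {
    auto it = entries_.find(key);
    if (it == entries_.end()) {
      return std::nullopt;
    }
    return it->second;
  }

  void Insert(const std::string& key, const std::string& response)
  {
    entries_[key] = response;
  }

 private:
  std::unordered_map<std::string, std::string> entries_;
};

// Runs fn and returns the elapsed wall time in nanoseconds.
template <typename F>
uint64_t ElapsedNs(F&& fn)
{
  const auto start = std::chrono::steady_clock::now();
  fn();
  const auto end = std::chrono::steady_clock::now();
  return static_cast<uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(end - start)
          .count());
}

// Top-level flow: look up the request first and run the composing models
// (represented by `pipeline`) only on a miss, then insert the new response.
std::string RunEnsemble(
    ToyResponseCache& cache, const std::string& request_key,
    const std::function<std::string()>& pipeline, Timings* timings)
{
  assert(timings != nullptr);
  *timings = Timings{};

  std::optional<std::string> cached;
  timings->lookup_ns =
      ElapsedNs([&] { cached = cache.Lookup(request_key); });
  if (cached) {
    // Cache hit: respond without executing the composing models.
    timings->cache_hit = true;
    return *cached;
  }

  // Cache miss: the pipeline runs as is, then the response is cached.
  std::string response;
  timings->pipeline_ns = ElapsedNs([&] { response = pipeline(); });
  timings->insert_ns =
      ElapsedNs([&] { cache.Insert(request_key, response); });
  return response;
}
```

Calling `RunEnsemble` twice with the same key runs the pipeline callable only on the first call. The second call returns the cached response and leaves `pipeline_ns` and `insert_ns` at zero.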

@lkomali lkomali requested a review from rmccorm4 April 4, 2024 05:32
@lkomali lkomali changed the title from "Support top level request caching for ensemble models" to "Support top level response caching for ensemble models" Apr 6, 2024
@lkomali lkomali marked this pull request as draft April 23, 2024 23:24
@lkomali lkomali marked this pull request as ready for review April 25, 2024 08:31
@lkomali lkomali requested a review from rmccorm4 April 25, 2024 08:33
@rmccorm4 rmccorm4 requested a review from oandreeva-nv April 29, 2024 18:06
Contributor

@rmccorm4 rmccorm4 left a comment

Nice work, Harshini! 🚀

Contributor

rmccorm4 commented May 6, 2024

FYI don't merge this until the testing PR is also ready to merge and pipelines look good

Member

@Tabrizian Tabrizian left a comment

Really clean code, thanks @lkomali !

Member

@Tabrizian Tabrizian left a comment

🚀

@rmccorm4 rmccorm4 merged commit 47f3f4e into main May 9, 2024
1 check passed
4 participants