This release contains test updates and fixes for continuous batching, plus a small logging improvement.
What's Changed
- make truncation of token lists optional in example script by @maxdebayser in #317
- [Fix][Tests] TP param used in tests unconditionally by @rafvasq in #315
- Print compile cache enablement along with warmup time by @sducouedic in #321
- ✅ add assertions for warmup mode context by @prashantgupta24 in #294
- fix off by one error by @maxdebayser in #324
- 🐛 fix cb online test by @joerunde in #326
- [CB] Update CB docs + Refactoring scheduling step-by-step inference tests by @sducouedic in #323
Full Changelog: v0.5.2...v0.5.3