-
Notifications
You must be signed in to change notification settings - Fork 3
test #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test #7
Conversation
- Avoid copying message bytes (can unpickle directly from the zmq frame) - Split proxy into simpler concurrent inbound and outbound tasks - Use pickle instead of cloudpickle for response messages - Use tuples instead of lists (cpython reuses these)
Like benchmark_throughput but using AsyncLLMEngine rather than LLM
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge). To run full CI, you can do one of these:
🚀 |
* updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * stash Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * cleanup Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
* [Update] LMcache connector v1 implementation Signed-off-by: ApostaC <yihua98@uchicago.edu> * [Add] examples for disaggregated prefill Signed-off-by: ApostaC <yihua98@uchicago.edu> * [add] extra information about evns Signed-off-by: ApostaC <yihua98@uchicago.edu> * Initial stubs for P/D scheduling changes Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Updates Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Rs branch (#3) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Rs branch (#5) Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Remove Unneeded Arguments (#7) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * stash Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * cleanup Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Improve disagg-example.sh (#8) - fix spelling - CUDA_VISIBLE_DEVICES should be set externally Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * added connector Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * update Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * remove Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * seems to load properly Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Revert "updated" This reverts commit 97316d9. * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * stash Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * added Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * diffs for local dev on macos Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * update Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updaed Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Checkpoint. Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * WIP Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated on scheduler side Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Hacking away Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * cleanup Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * ensure request removed from running list Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Runs E2E. Garbage output. Crashes on 2nd request Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * update Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * rename files Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * update Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Second request no longer crashes Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Remove gpu_model_runner hacks Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Clean up Justfile Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * [Bugfix] Stale finished requests in EMPTY_MODEL_RUNNER_OUTPUT Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * update Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * justfile edits Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Update Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Fixes - lm_eval gsm8k has correctness Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * "just delete the assert" Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * fixup precommit issues Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Fixes Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated (#12) Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Add Accuracy Test (#13) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Preemption Bugfixes (#15) * stash fixed double free issue Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * fixed issue Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated (#16) Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Fix Bad Merge | Fix Memory Leak in Upstream (#18) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * fix merge Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * clean up justfile, examples Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * more cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * more cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * more cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * more cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * More cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * more cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * more cleanup, precommit fixes Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * More cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * run_accuracy_test.sh UX Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * squash warnings Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * pre-commit Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * update Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Add get_finished to base kv connector Signed-off-by: mgoin <mgoin64@gmail.com> * revert test.txt Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * review comments Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> --------- Signed-off-by: ApostaC <yihua98@uchicago.edu> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: ApostaC <yihua98@uchicago.edu> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> Co-authored-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> Co-authored-by: Robert Shaw <rshaw@neuralmagic.com> Co-authored-by: mgoin <mgoin64@gmail.com> Co-authored-by: mgoin <mgoin64@gmail.com>
* [Update] LMcache connector v1 implementation Signed-off-by: ApostaC <yihua98@uchicago.edu> * [Add] examples for disaggregated prefill Signed-off-by: ApostaC <yihua98@uchicago.edu> * [add] extra information about evns Signed-off-by: ApostaC <yihua98@uchicago.edu> * Initial stubs for P/D scheduling changes Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Updates Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Rs branch (#3) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Rs branch (#5) Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Remove Unneeded Arguments (#7) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * stash Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * cleanup Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Improve disagg-example.sh (#8) - fix spelling - CUDA_VISIBLE_DEVICES should be set externally Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * added connector Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * update Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * remove Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * seems to load properly Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Revert "updated" This reverts commit 97316d9. * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * stash Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * added Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * diffs for local dev on macos Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * update Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updaed Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Checkpoint. Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * WIP Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated on scheduler side Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Hacking away Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * cleanup Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * ensure request removed from running list Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Runs E2E. Garbage output. Crashes on 2nd request Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * update Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * rename files Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * update Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Second request no longer crashes Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Remove gpu_model_runner hacks Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Clean up Justfile Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * [Bugfix] Stale finished requests in EMPTY_MODEL_RUNNER_OUTPUT Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * update Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * justfile edits Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Update Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Fixes - lm_eval gsm8k has correctness Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * "just delete the assert" Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * fixup precommit issues Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Fixes Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated (#12) Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Add Accuracy Test (#13) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Preemption Bugfixes (#15) * stash fixed double free issue Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * fixed issue Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated (#16) Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Fix Bad Merge | Fix Memory Leak in Upstream (#18) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * fix merge Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * cleanup code Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * cleanup code Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * stash Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatted Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * revert Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * more spurious changes Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Update vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py Co-authored-by: Tyler Michael Smith <tysmith@redhat.com> * Update vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py Co-authored-by: Tyler Michael Smith <tysmith@redhat.com> --------- Signed-off-by: ApostaC <yihua98@uchicago.edu> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> Co-authored-by: ApostaC <yihua98@uchicago.edu> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by: Tyler Michael Smith <tysmith@redhat.com> Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
* [Update] LMcache connector v1 implementation Signed-off-by: ApostaC <yihua98@uchicago.edu> * [Add] examples for disaggregated prefill Signed-off-by: ApostaC <yihua98@uchicago.edu> * [add] extra information about evns Signed-off-by: ApostaC <yihua98@uchicago.edu> * Initial stubs for P/D scheduling changes Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Updates Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Rs branch (#3) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Rs branch (#5) Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Remove Unneeded Arguments (#7) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * stash Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * cleanup Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Improve disagg-example.sh (#8) - fix spelling - CUDA_VISIBLE_DEVICES should be set externally Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * added connector Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * update Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * remove Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * seems to load properly Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Revert "updated" This reverts commit 97316d9. * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * stash Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * added Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * diffs for local dev on macos Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * update Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updaed Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Checkpoint. Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Cleanup Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * WIP Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated on scheduler side Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Hacking away Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * cleanup Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * ensure request removed from running list Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Runs E2E. Garbage output. Crashes on 2nd request Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * update Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * rename files Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * updated Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * update Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> * Second request no longer crashes Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Remove gpu_model_runner hacks Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Clean up Justfile Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * [Bugfix] Stale finished requests in EMPTY_MODEL_RUNNER_OUTPUT Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * update Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * justfile edits Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Update Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Fixes - lm_eval gsm8k has correctness Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * "just delete the assert" Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * fixup precommit issues Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Fixes Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * updated (#12) Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Add Accuracy Test (#13) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Preemption Bugfixes (#15) * stash fixed double free issue Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * fixed issue Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatrd Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated (#16) Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Fix Bad Merge | Fix Memory Leak in Upstream (#18) * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * fix merge Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> --------- Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * cleanup code Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * cleanup code Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * stash Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updatted Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * revert Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * more spurious changes Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * updated Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> * Support MLA in NIXL connector Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * WIP adding tests Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * wip Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * Fixes Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> --------- Signed-off-by: ApostaC <yihua98@uchicago.edu> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> Signed-off-by: Robert Shaw <rshaw@neuralmagic.com> Co-authored-by: ApostaC <yihua98@uchicago.edu> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> Co-authored-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
FILL IN THE PR DESCRIPTION HERE
FIX #xxxx (link existing issues this PR will resolve)
BEFORE SUBMITTING, PLEASE READ THE CHECKLIST BELOW AND FILL IN THE DESCRIPTION ABOVE
PR Checklist (Click to Expand)
Thank you for your contribution to vLLM! Before submitting the pull request, please ensure the PR meets the following criteria. This helps vLLM maintain the code quality and improve the efficiency of the review process.
PR Title and Classification
Only specific types of PRs will be reviewed. The PR title is prefixed appropriately to indicate the type of change. Please use one of the following:
[Bugfix]
for bug fixes.[CI/Build]
for build or continuous integration improvements.[Doc]
for documentation fixes and improvements.[Model]
for adding a new model or improving an existing model. Model name should appear in the title.[Frontend]
For changes on the vLLM frontend (e.g., OpenAI API server,LLM
class, etc.)[Kernel]
for changes affecting CUDA kernels or other compute kernels.[Core]
for changes in the core vLLM logic (e.g.,LLMEngine
,AsyncLLMEngine
,Scheduler
, etc.)[Hardware][Vendor]
for hardware-specific changes. Vendor name should appear in the prefix (e.g.,[Hardware][AMD]
).[Misc]
for PRs that do not fit the above categories. Please use this sparingly.Note: If the PR spans more than one category, please include all relevant prefixes.
Code Quality
The PR need to meet the following code quality standards:
format.sh
to format your code.docs/source/
if the PR modifies the user-facing behaviors of vLLM. It helps vLLM user understand and utilize the new features or changes.Notes for Large Changes
Please keep the changes as concise as possible. For major architectural changes (>500 LOC excluding kernel/data/config/test), we would expect a GitHub issue (RFC) discussing the technical design and justification. Otherwise, we will tag it with
rfc-required
and might not go through the PR.What to Expect for the Reviews
The goal of the vLLM team is to be a transparent reviewing machine. We would like to make the review process transparent and efficient and make sure no contributor feel confused or frustrated. However, the vLLM team is small, so we need to prioritize some PRs over others. Here is what you can expect from the review process:
action-required
label on the PR if there are changes required. The contributor should address the comments and ping the reviewer to re-review the PR.Thank You
Finally, thank you for taking the time to read these guidelines and for your interest in contributing to vLLM. Your contributions make vLLM a great tool for everyone!