[TPU][Test] Divide the test to 2 parts and reduce timeout. #128
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
Can you link a build with both test jobs for both parts running here? Here is how to test it against your branch on vLLM: https://github.com/vllm-project/ci-infra?tab=readme-ov-file#how-to-test-changes-in-this-repo
Sorry, I meant against this branch: vllm-project/vllm#21431. In fact, we should not trigger anything on main.
Sorry! My bad. I didn't realize I could do this on my fork branch. Thank you for pointing that out!
Updated to the latest code and created a build: https://buildkite.com/vllm/ci/builds/24861
@QiliangCui Can you check whether this test failure is related to the split? If not, I can merge the PR: https://buildkite.com/vllm/ci/builds/24871#01983d73-229c-40e7-b66e-7bb411c2fc6e
Thank you, Kevin. It is not related; it is safe to merge.
Created a new build to verify with the latest code: https://buildkite.com/vllm/ci/builds/24997
Hi @khluu, can we merge both this PR and vllm-project/vllm#21431?
The existing TPU V1 Test usually takes about 1 hour and 50 minutes, but its timeout is set to 300 minutes (5 hours). If the test gets stuck, the wait before it times out is very long.
I want to split it into 2 tests and give each of them a shorter timeout.
TODO: after merging this and the vllm branch PR, set the part 1 timeout to 90 minutes.
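For illustration only, the split might look roughly like the following Buildkite steps. This is a sketch, not the actual change in this PR: the step labels, script paths, and the part 2 timeout are hypothetical placeholders, and only the 90-minute value for part 1 comes from the TODO. `timeout_in_minutes` is the standard Buildkite command-step attribute for a per-job timeout.

```yaml
# Hypothetical sketch: labels, script paths, and the part 2 timeout are
# placeholders and are not taken from this PR.
steps:
  - label: "TPU V1 Test Part 1"
    command: "bash path/to/tpu-v1-test-part1.sh"  # hypothetical script path
    timeout_in_minutes: 90   # target value from the TODO

  - label: "TPU V1 Test Part 2"
    command: "bash path/to/tpu-v1-test-part2.sh"  # hypothetical script path
    timeout_in_minutes: 90   # assumed; the PR description does not state it
```

With two steps, a stuck job would be cancelled once its own, shorter timeout expires instead of holding an agent for up to 300 minutes.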