[core] Deflake test_runtime_env_pip_and_conda_4.py
#52750
reason="Requires PR wheels built in CI, so only run on linux CI machines.",
)
@pytest.mark.parametrize("field", ["pip"])
def test_pip_ray_is_overwritten(start_cluster, field):
strange diff rendering because I took this out of the `TestGC` nesting
Changed to use |
Passed in 132s in CI now: https://buildkite.com/ray-project/premerge/builds/39189#01969286-d550-40d7-bce7-c44d37760fc9/184-1792
…es/deflake-re-4
@@ -116,7 +115,7 @@ class TestGC:
reason="Needs PR wheels built in CI, so only run on linux CI machines.",
)
@pytest.mark.parametrize("field", ["conda", "pip"])
@pytest.mark.parametrize("spec_format", ["file", "python_object"])
a few minor speedups in this file. no need to test GC logic against the file and object behavior and the sleep was unneeded
why don't you need to test against file?
there is no meaningful difference in the GC implementation across the two conditions
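To make the speedup concrete: stacked `pytest.mark.parametrize` decorators generate the cross product of their values, so dropping one `spec_format` value halves the number of `TestGC` cases for that test. The sketch below only counts combinations with `itertools.product`; it does not use pytest itself, and which `spec_format` value was kept is an assumption (the thread does not say).

```python
from itertools import product

fields = ["conda", "pip"]
spec_formats_before = ["file", "python_object"]
# Assumption: the thread does not state which format was kept;
# "python_object" is used here purely for illustration.
spec_formats_after = ["python_object"]


def count_cases(*param_lists):
    """Number of test cases produced by stacking one parametrize per list."""
    return len(list(product(*param_lists)))


cases_before = count_cases(fields, spec_formats_before)
cases_after = count_cases(fields, spec_formats_after)
print(f"before: {cases_before} cases, after: {cases_after} cases")
```

Since each case installs a runtime env, fewer cases translates fairly directly into less wall-clock time.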
@@ -139,9 +138,6 @@ def f():

# Ensure that the runtime env has been installed.
assert ray.get(f.remote())
# Sleep some seconds before checking that we didn't GC. Otherwise this
# check may spuriously pass.
time.sleep(2)
I'm assuming the sleep existed to make sure some code ran after the get. Otherwise the following assert will always pass if you run it directly after; removing the sleep means the test no longer tests that behavior
the check on the following line doesn't really need to be there at all IMO except as a sanity check for the testing utils themselves. it's asserting that we don't GC runtime_envs for active jobs. note that:
- if we did, many other test cases would fail as this is very basic functionality.
- this is not really a reliable way to test for the behavior. the GC can be arbitrarily delayed so in order to be sure this is checking what we intend, the sleep needs to be arbitrarily long :)
out of an abundance of caution, I updated the PR to perform the check in a more deterministic way: wait for the task to be marked `FINISHED`, then perform the check a few times in a loop. this should provide the same level of guarantee without the nondeterminism/delay of the sleep
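A minimal sketch of the pattern described above, assuming nothing about the actual helpers used in the PR: `wait_for_condition`, `assert_holds_repeatedly`, and `FakeTask` are hypothetical stand-ins written here so the example runs without Ray. The idea is to block on an observable state change first, then repeat the "not GC'd" check a bounded number of times instead of sleeping for a fixed interval.

```python
import time


def wait_for_condition(predicate, timeout_s=10.0, retry_interval_s=0.05):
    """Poll `predicate` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return
        time.sleep(retry_interval_s)
    raise TimeoutError("condition was not met before the timeout")


def assert_holds_repeatedly(check, attempts=5, interval_s=0.05):
    """Assert that `check` stays True across several consecutive polls."""
    for attempt in range(attempts):
        assert check(), f"check failed on attempt {attempt + 1}"
        time.sleep(interval_s)


class FakeTask:
    """Hypothetical stand-in for a task whose state is polled."""

    def __init__(self, polls_until_finished):
        self._polls = 0
        self._polls_until_finished = polls_until_finished

    def state(self):
        self._polls += 1
        if self._polls >= self._polls_until_finished:
            return "FINISHED"
        return "RUNNING"


def env_still_present():
    # Hypothetical check; a real test would inspect the runtime_env directory.
    return True


task = FakeTask(polls_until_finished=3)
wait_for_condition(lambda: task.state() == "FINISHED")
assert_holds_repeatedly(env_still_present, attempts=3, interval_s=0.01)
```

The wait is bounded by a timeout rather than a fixed delay, so the test finishes as soon as the state is observed and only fails slowly when something is actually wrong.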
cool makes sense
Test was [timing out](https://buildkite.com/ray-project/postmerge/builds/9892#019691bf-f352-4fbf-a92c-ff277cf7a901/176-1944) sometimes -- let's make it faster.

Updated the slowest test condition to avoid restarting ray each time, which allows the runtime_env cache to be hit and not have to install the env 3 times.

Before:
```bash
================= 8 passed, 1 skipped in 96.56s (0:01:36) ==================
```
After:
```bash
================= 8 passed, 1 skipped in 62.14s (0:01:02) ==================
```
---------
Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Signed-off-by: zhaoch23 <c233zhao@uwaterloo.ca>
Test was timing out sometimes -- let's make it faster.
Updated the slowest test condition to avoid restarting Ray each time, so the runtime_env cache is hit and the env is installed once instead of 3 times.
Before:
================= 8 passed, 1 skipped in 96.56s (0:01:36) ==================
After:
================= 8 passed, 1 skipped in 62.14s (0:01:02) ==================
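The effect of keeping the cluster alive can be illustrated with a toy model. This is not Ray's runtime_env implementation: `ToyEnvCache`, `run_cases`, and the spec tuple are hypothetical, and the model simply assumes that a restart discards the cache, as the description above implies.

```python
class ToyEnvCache:
    """Toy cache keyed by env spec; counts how many installs were needed."""

    def __init__(self):
        self._installed = {}
        self.install_count = 0

    def get_or_install(self, spec):
        if spec not in self._installed:
            # Cache miss: this stands in for the expensive pip/conda install.
            self.install_count += 1
            self._installed[spec] = f"env-{self.install_count}"
        return self._installed[spec]


def run_cases(num_cases, restart_between_cases):
    """Return the total number of installs across `num_cases` test conditions."""
    spec = ("pip", "some-package")  # hypothetical env spec
    cache = ToyEnvCache()
    total_installs = 0
    for _ in range(num_cases):
        if restart_between_cases:
            # A restart is modeled as throwing the cache away.
            total_installs += cache.install_count
            cache = ToyEnvCache()
        cache.get_or_install(spec)
    return total_installs + cache.install_count


installs_with_restart = run_cases(3, restart_between_cases=True)
installs_without_restart = run_cases(3, restart_between_cases=False)
print(installs_with_restart, installs_without_restart)
```

In this model, three conditions cost three installs with a restart between each and one install without, which matches the direction of the before/after timings above.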