-
Couldn't load subscription status.
- Fork 297
Enhancement: WIP: create_profile no longer computes extrema when we provide override_bins
#5184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
create_profile no longer computes extrema when we provide override_binscreate_profile no longer computes extrema when we provide override_bins
This was a simple cut-and-paste
This can save a lot of time with large simulations
588a53a to
e72c514
Compare
| if extrema is None or not any(collapse(extrema.values())): | ||
| # when extrema are all Nones, we treat them as though extrema was set as None. | ||
| # | ||
| # In the future, it may be better to replace `not any(...)` with either: | ||
| # `all(v is None for v in collapse(extrema.values()))` OR | ||
| # `all(v is None or v == (None, None) for v in collapse(extrema.values))` | ||
| # | ||
| # Be aware that doing this changes behavior, when extrema has the value: | ||
| # `{<bin_field>: (0., None)}` or `{<bin_field>: (None, 0.)}` | ||
| # I happen to think think that the existing behavior is a bug (i.e. the | ||
| # behavior changes if we replaced 0. with **ANY** other number) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to reviewers: This is mostly for posterity. I don't think we should deal with this issue in this PR (since this PR is focused on not changing observable behavior). I plan to make an issue for this after the PR is reviewed
I think we should probably change the if-statement to:
if extrema is None or all(v is None for v in collapse(extrema.values())):OR
if extrema is None or all(list(v) == [None, None] for v in extrema.values()):As the comments mention, both choices technically change the behavior when one of the extrema is a 0 and the other is a None, but I think the existing behavior is probably a bug. The former option is more backwards compatible while the latter option is more consistent with the rest of our extrema handling.
For more context, the following table whether various choices for the extrema argument are considered valid, when bin_fields is [("gas", "velocity_x"), ("gas", "density")]
| Current Logic | Snippet 1 | Snippet 2 | |
|---|---|---|---|
{("gas","velocity_x"): (0.0, None)} |
️✅* | ❌ | ❌ |
{("gas","velocity_x"): None, ("gas", "density"): None} |
️✅ | ️✅ | ❌ |
{("gas","velocity_x"): (None, None), ("gas", "density"): (None, None)} |
️✅ | ️✅ | ️✅ |
{("gas","velocity_x"): (0.0, None), ("gas", "density"): (None, None)} |
️✅* | ❌ | ❌ |
{("gas","velocity_x"): None, ("gas", "density"): (1e-27, 1e-23)} |
❌ | ❌ | ❌ |
{("gas","velocity_x"): (None,None), ("gas", "density"): (1e-27, 1e-23)} |
️✅ | ️✅ | ️✅ |
Importantly: The starred cases treat are scenarios that the snippets would fix. In those cases, the existing logic totally ignores the fact that the caller specifies that the lower extrema for ("gas","velocity") is 0.0 and acts like it was set to None. Furthermore, the existing logic causes the function to behave completely differently for the following cases:
{("gas","velocity_x"): (-1.0, None)}(this is considered to be invalid){("gas","velocity_x"): (1.0, None), ("gas", "density"): (None, None)}{("gas","velocity_x"): (0.0, None), ("gas", "density"): (None, 1e-20)}.
PR Summary
The current implementation of
create_profilecomputes extrema whenoverride_binsis provided. This can significantly increase the cost of an operation when working with large datasets. This PR addresses that scenario.The PR also refactors argument handling of
create_profileto be more explicit forextremaandoverride_bins. The details of these arguments have a number of undocumented idiosyncracies. While the code is more verbose, I think it makes the requirements a little more explicit (and the error messages are more detailed). In the future, I think we can improve this a lot more, but this seems like a reasonable first-step.It may be easiest to review this PR commit-by-commit.
PR Checklist
The PR Checklist isn't applicable since I haven't changed any user-facing behavior