-
-
Notifications
You must be signed in to change notification settings - Fork 30
Benchmarks for timeseries aggregate_downsample()
#135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Benchmarks for timeseries aggregate_downsample()
#135
Conversation
d3d6a68
to
78d6186
Compare
benchmarks/timeseries.py
Outdated
|
||
# FIXME: it hits the known issue in astropy/utils/masked/core.py | ||
# NotImplementedError: masked instances cannot yet deal with 'reduceat' or 'at'. | ||
# relevant PR: https://github.com/astropy/astropy/pull/17875 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an open issue we can link this to? Linking to a closed astropy/astropy#17875 won't help with visibility. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if there is an issue explicitly tracking the problem, though I suspect astropy/astropy#17875 was prematurely closed: @mvhk mentioned he'd like to have the issue addressed by the PR resolved post v7.1.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it to the list at astropy/astropy#11539 so that we don't forget.
My sense here is to actually remove the MaskedQuantity
from the table, i.e., make two tables in setup
and then allow passing on a table to _do_time_aggretate...
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you clarify on what you mean? Do you mean splitting up the current the test table, with a mix of plain Column
, MaskedColumn
, Quantity
, MaskedQuantity
columns, to 4 test tables (one for each column type), thus making the test more fine-grained?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I had in mind was simpler, to make a self.ts2
that omits the failing column and using that for this test. Possibly, yours is the better idea, though, so do what you think is best! Really, all I hoped was to actually do some benchmark also with np.add.reduceat
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks good! Some very small comments.
benchmarks/timeseries.py
Outdated
ts["a"] = np.random.random(num_samples) # plain Column | ||
|
||
ts["a_mc"] = MaskedColumn(ts["a"].value, mask=False) | ||
ts["a_mc"][1] = True # just to ensure some value is masked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be set to np.ma.masked
to ensure a mask? RIght now, I think it just sets the value to 1...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake. I do mean to have the value masked instead of just setting the value.
benchmarks/timeseries.py
Outdated
ts["a_q"] = ts["a"] * u.dimensionless_unscaled # Quantity | ||
|
||
ts["a_mq"] = Masked(ts["a_q"]) # MaskedQuantity | ||
ts["a_mq"][1] = True # just to ensure some value is masked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above.
benchmarks/timeseries.py
Outdated
|
||
# FIXME: it hits the known issue in astropy/utils/masked/core.py | ||
# NotImplementedError: masked instances cannot yet deal with 'reduceat' or 'at'. | ||
# relevant PR: https://github.com/astropy/astropy/pull/17875 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it to the list at astropy/astropy#11539 so that we don't forget.
My sense here is to actually remove the MaskedQuantity
from the table, i.e., make two tables in setup
and then allow passing on a table to _do_time_aggretate...
.
… each column type - aggregate_func combination
0a96b67
to
326d4b7
Compare
The issues raised by mvhk have been resolved. The test is now more granular, one for each column type - With PR 18188:
-- aggregate_func
---------- --------------------------------------------------
col_type None <function nanmean> <ufunc 'add'>
========== ============= ==================== ===============
col 7.21±0.3ms 24.5±0.4ms 7.03±4ms
mcol 7.28±0.08ms 69.4±3ms 7.32±0.6ms
qty 6.97±0.6ms 11.8±0.3ms 6.89±0.04ms
mqty 7.35±0.07ms 28.2±0.2ms n/a
========== ============= ==================== ===============
Before the PR (v7.1.0-like)
-- aggregate_func
---------- -------------------------------------------------
col_type None <function nanmean> <ufunc 'add'>
========== ============ ==================== ===============
col 7.00±0.2ms 24.6±0.4ms 6.95±0.1ms
mcol 7.23±2ms 69.8±10ms 7.31±0.1ms
qty 6.99±0.1ms 31.5±8ms 6.89±0.1ms
mqty 7.25±0.1ms 59.3±9ms n/a
========== ============ ==================== ===============
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super, this looks great!
Thanks all! |
Would have caught the performance regression in v7.1.0 ( astropy/astropy#18176 ), being fixed by PR astropy/astropy#18188 .
The benchmark results on my machine
aggregate_func
None
np.nanmean
@mvhk @pllim