-
Notifications
You must be signed in to change notification settings - Fork 151
Drop combine_size
#1008
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Drop combine_size
#1008
Conversation
d55479d
to
4fab441
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Someone knowledgeable about the broadcasting machinery should review this probably - but I was reading this and figured I might as well add the few comments I had.
add `static_combine_axes` to handle Tuple correctly
1. `axes` should be static. 2. The implementation for broadcast(1) are unified) 3. simplify `broadcast_index`
Co-Authored-By: Thomas Christensen <tchr@mit.edu>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done some work with broadcasting before but I'm not quite familiar with certain parts of broadcasting touched in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks fine 👍 .
Could you bump patch version? I'd tag a new release. |
Patch version was bumped in #1001. (master is 1.4.2 at present). I guess there's no need to bump again? |
Yes, right, I didn't notice. |
I think this affects HybridArrays. See: |
This PR tries to perform
combine_size
duringBroadcast.instantiate
:broadcast
,combine_axes
will return aTuple{Vararg{SOneTo}}
, thus the output size has been static.broadcast!
, thedest
'saxes
is marked as static (if possible). If there's no size-1 dim insrc
. The current generated kernal will be called.With this change, we can reduce the compile time for first broadcast by about 1/4: