Summary:
X-link: facebookresearch/FBGEMM#1467
Pull Request resolved: #4396
Some integrations of FBGEMM kernels with OSS systems like vLLM would be simpler if preshuffled tensors could be sliced. Prior to this diff, there were two blockers to doing that:
- Scales were required to be contiguous. This is easily addressed by setting the stride argument more carefully.
- Shuffled tensors have a non-trivial layout. We add a Python helper function for slicing int4 shuffled tensors. Notably, it involves some data copying that I believe is unavoidable. Hopefully it only needs to be done during model setup.
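
To illustrate the first point, here is a minimal pure-Python sketch of why a sliced scales tensor stops being contiguous and why passing the parent's row stride is enough to read it without a copy. The shapes, the sliced dimension, and the helper names (`strided_view`, `is_contiguous`) are assumptions for illustration only; they are not the FBGEMM API or the actual kernel arguments.

```python
def strided_view(rows, cols, row_stride, offset=0):
    """Flat-buffer indices of a rows x cols view with the given row stride."""
    return [[offset + r * row_stride + c for c in range(cols)]
            for r in range(rows)]


def is_contiguous(cols, row_stride):
    """A row-major 2D view is contiguous only if rows are packed back to back."""
    return row_stride == cols


# Hypothetical 4 x 8 row-major buffer of scales.
ROWS, COLS = 4, 8
scales = [float(i) for i in range(ROWS * COLS)]

# Slice columns 2..6 of every row without copying: the view keeps the
# parent's row stride (8) even though it only has 4 columns.
start, stop = 2, 6
view = strided_view(ROWS, stop - start, row_stride=COLS, offset=start)
sliced = [[scales[i] for i in row] for row in view]

# A consumer that assumes row_stride == cols would read the wrong elements;
# one that takes the stride as an argument reads the slice correctly.
print(is_contiguous(stop - start, COLS))
print(sliced[1])
```

The second blocker does not reduce to a stride change: in a shuffled layout, logically adjacent elements are not laid out at a fixed pitch in memory, which is consistent with the helper needing to copy data.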
Reviewed By: jiawenliu64, jianyuh
Differential Revision: D77239566
fbshipit-source-id: ad8eea5eb153f851f1b1e297a566fd36c0ac6409