Release v0.0.4 · pytorch-labs/helion

What's Changed

Beef up pre-commit checks by @oulgen in #106
Run pre-commit as part of lint action by @oulgen in #108
Add jagged_dense_add_2d example and generalize tensor indexing by @jansel in #105
Update README.md with Helion logo by @oulgen in #100
Optimization pass to remove unneeded masking by @jansel in #109
Improve mask optimization to cover control flow and inductor ops by @jansel in #111
Expand README.md by @jansel in #112
Fix ImportError: cannot import name 'Never' from 'typing' by @jansel in #114
Remove 'first_non_grid_index' for hl.grid index by @jansel in #113
Pass to remove unnecessary hl.tile_index calls by @jansel in #115
Replace torch.fx.GraphModule with torch.fx.Graph by @jansel in #116
MoE matmul example by @yf225 in #110
Add main() to moe_matmul_ogs by @yf225 in #118
Add pre-commit hook to make sure examples have a main function by @oulgen in #119
Add reduction example: Long sum by @joydddd in #92
Make loop reordering work with register_block_size by @jansel in #117
Temporarily disable unit test for moe_matmul_ogs example by @yf225 in #120
Skip test_moe_matmul_ogs on older cards by @jansel in #121
Make l2_grouping work with register_block_size by @jansel in #122
Re-enable unit test for moe_matmul_ogs example; skip in fbcode by @yf225 in #123

Full Changelog: v0.0.3...v0.0.4