v0.1.2

@yzhangcs released this 31 Mar 06:30 · 53b3ac7

What's Changed

  • [RWKV7] fix RWKV7Attention.__init__ by @exhyy in #238
  • fix(triton): remove num_warps=8 in bwd_prepare_wy_repr_kernel to avoid an MMA layout assertion on non-Ampere GPUs by @kugwzk in #240 (see the Triton sketch after this list)
  • [Fix]: reshape o before o_proj in linear_attn layer. by @Luther-Sparks in #243
  • [CI] Separate tests into compile, normal and varlen by @zhiyuan1i in #247
  • [ABC] Add use_rope parameter to ABCAttention and ABCConfig & Fix compiler bugs in kernels by @yzhangcs in #248 (see the config sketch after this list)
  • [CI] trigger GPU workflow only on pull_request events by @zhiyuan1i in #249
  • Create test_linearatten.py by @kangyiyang in #250
  • [CI] Fix all errors and enable testing for PRs by @zhiyuan1i in #251
  • [CI] add H100 GPU by @zhiyuan1i in #254
  • [Gated DeltaNet] fix GDN kernel bugs on H100 when vdim=64 by @kugwzk in #256
  • [Test] Enhance support for NVIDIA Hopper GPU by @zhiyuan1i in #257
  • [FAQ] Update triton-nightly links by @yzhangcs in #259
  • [Attn] Add triton impls for MHA/GQA by @yzhangcs in #260
  • [Attn] Use larger block size for hopper devices by @yzhangcs in #261
  • [Attn] Enable test for attn by @zhiyuan1i in #262
  • [CI] fix a syntax error in triton-nightly by @zhiyuan1i in #263
  • Bump fla to v0.1.2 by @yzhangcs in #264
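
A note on #240: the fix prunes `num_warps=8` from the kernel's autotune space, since that configuration trips an MMA layout assertion on non-Ampere GPUs. Below is a minimal sketch of the pattern with a toy kernel; the kernel and its configs are illustrative, not fla's actual `bwd_prepare_wy_repr_kernel`.

```python
import torch
import triton
import triton.language as tl

# Toy kernel standing in for bwd_prepare_wy_repr_kernel.
# The autotune space deliberately stops at num_warps=4, mirroring #240:
# num_warps=8 hit an MMA layout assertion on non-Ampere GPUs.
@triton.autotune(
    configs=[triton.Config({}, num_warps=w) for w in (1, 2, 4)],
    key=['N'],
)
@triton.jit
def scale_kernel(x_ptr, y_ptr, N, BLOCK: tl.constexpr):
    pid = tl.program_id(0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < N
    x = tl.load(x_ptr + offs, mask=mask)
    tl.store(y_ptr + offs, x * 2.0, mask=mask)


x = torch.randn(1024, device='cuda')
y = torch.empty_like(x)
scale_kernel[(triton.cdiv(x.numel(), 256),)](x, y, x.numel(), BLOCK=256)
```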
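
And on #248: `use_rope` is now accepted by both `ABCAttention` and `ABCConfig`. A minimal usage sketch, assuming `ABCConfig` and `ABCForCausalLM` are exported from `fla.models` like the other model classes (hyperparameters here are illustrative, not recommendations):

```python
from fla.models import ABCConfig, ABCForCausalLM

# Sketch only: turn on rotary position embeddings via the flag added in #248.
config = ABCConfig(
    hidden_size=512,        # illustrative value
    num_hidden_layers=2,    # illustrative value
    use_rope=True,          # new parameter from #248
)
model = ABCForCausalLM(config)
```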

New Contributors

Full Changelog: v0.1.1...v0.1.2