v0.1.2

@yzhangcs released this 31 Mar 06:30 · 53b3ac7

What's Changed

  • [RWKV7] fix RWKV7Attention.__init__ by @exhyy in #238
  • fix(triton): remove num_warps=8 in bwd_prepare_wy_repr_kernel to avoid an MMA layout assertion on non-Ampere GPUs by @kugwzk in #240 (see the Triton sketch after this list)
  • [Fix]: reshape o before o_proj in linear_attn layer. by @Luther-Sparks in #243
  • [CI] Separate tests into compile, normal and varlen by @zhiyuan1i in #247
  • [ABC] Add use_rope parameter to ABCAttention and ABCConfig & Fix compiler bugs in kernels by @yzhangcs in #248 (see the config sketch after this list)
  • [CI] trigger GPU workflow only on pull_request events by @zhiyuan1i in #249
  • Create test_linearatten.py by @kangyiyang in #250
  • [CI] Fix all errors and enable testing for PRs by @zhiyuan1i in #251
  • [CI] add H100 GPU by @zhiyuan1i in #254
  • [Gated DeltaNet] fix GDN kernel bugs on H100 when vdim=64 by @kugwzk in #256
  • [Test] Enhance support for NVIDIA Hopper GPU by @zhiyuan1i in #257
  • [FAQ] Update triton-nightly links by @yzhangcs in #259
  • [Attn] Add triton impls for MHA/GQA by @yzhangcs in #260
  • [Attn] Use larger block size for hopper devices by @yzhangcs in #261
  • [Attn] Enable test for attn by @zhiyuan1i in #262
  • [CI] fix a syntax error in triton-nightly by @zhiyuan1i in #263
  • Bump fla to v0.1.2 by @yzhangcs in #264
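
A note on #240: the fix prunes `num_warps=8` from the kernel's autotune space, since that configuration trips an MMA layout assertion on non-Ampere GPUs. Below is a minimal sketch of the pattern with a toy kernel; the kernel and its configs are illustrative, not fla's actual `bwd_prepare_wy_repr_kernel`.

```python
import torch
import triton
import triton.language as tl

# Toy kernel standing in for bwd_prepare_wy_repr_kernel.
# The autotune space deliberately stops at num_warps=4, mirroring #240:
# num_warps=8 hit an MMA layout assertion on non-Ampere GPUs.
@triton.autotune(
    configs=[triton.Config({}, num_warps=w) for w in (1, 2, 4)],
    key=['N'],
)
@triton.jit
def scale_kernel(x_ptr, y_ptr, N, BLOCK: tl.constexpr):
    pid = tl.program_id(0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < N
    x = tl.load(x_ptr + offs, mask=mask)
    tl.store(y_ptr + offs, x * 2.0, mask=mask)


x = torch.randn(1024, device='cuda')
y = torch.empty_like(x)
scale_kernel[(triton.cdiv(x.numel(), 256),)](x, y, x.numel(), BLOCK=256)
```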
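
And on #248: `use_rope` is now accepted by both `ABCAttention` and `ABCConfig`. A minimal usage sketch, assuming `ABCConfig` and `ABCForCausalLM` are exported from `fla.models` like the other model classes (hyperparameters here are illustrative, not recommendations):

```python
from fla.models import ABCConfig, ABCForCausalLM

# Sketch only: turn on rotary position embeddings via the flag added in #248.
config = ABCConfig(
    hidden_size=512,        # illustrative value
    num_hidden_layers=2,    # illustrative value
    use_rope=True,          # new parameter from #248
)
model = ABCForCausalLM(config)
```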

New Contributors

Full Changelog: v0.1.1...v0.1.2