v0.1.2
What's Changed
- [RWKV7] Fix `RWKV7Attention.__init__` by @exhyy in #238
- fix(triton): remove `num_warps=8` in `bwd_prepare_wy_repr_kernel` to avoid MMA layout assertion on non-Ampere GPUs by @kugwzk in #240
- [Fix] Reshape `o` before `o_proj` in `linear_attn` layer by @Luther-Sparks in #243
- [CI] Separate tests into compile, normal and varlen by @zhiyuan1i in #247
- [ABC] Add `use_rope` parameter to ABCAttention and ABCConfig & fix compiler bugs in kernels by @yzhangcs in #248
- [CI] Trigger GPU workflow only on pull_request events by @zhiyuan1i in #249
- Create `test_linearatten.py` by @kangyiyang in #250
- [CI] Fix all errors and enable testing for PRs by @zhiyuan1i in #251
- [CI] add H100 GPU by @zhiyuan1i in #254
- [Gated DeltaNet] Fix GDN kernel bugs on H100 when `vdim=64` by @kugwzk in #256
- [Test] Enhance support for NVIDIA Hopper GPU by @zhiyuan1i in #257
- [FAQ] Update triton-nightly links by @yzhangcs in #259
- [Attn] Add triton impls for MHA/GQA by @yzhangcs in #260
- [Attn] Use larger block size for hopper devices by @yzhangcs in #261
- [Attn] Enable test for attn by @zhiyuan1i in #262
- [CI] fix a syntax error in triton-nightly by @zhiyuan1i in #263
- Bump `fla` to v0.1.2 by @yzhangcs in #264
New Contributors
- @exhyy made their first contribution in #238
- @kugwzk made their first contribution in #240
- @Luther-Sparks made their first contribution in #243
- @yzhangcs made their first contribution in #248
- @kangyiyang made their first contribution in #250
Full Changelog: v0.1.1...v0.1.2