v0.2.2
What's Changed
- [TokenShift] support `fused_token_shift` with `varlen` by @zhiyuan1i in #373
- [Mamba] Use official init strategies by @yzhangcs in #374
- [Mamba2] Create attn layer by @yzhangcs in #375
- [Mamba] Add attn layer & fix configs by @yzhangcs in #376
- [RWKV7] Update `fused_addcmul` impls by @zhiyuan1i in #378
- [RWKV7]: Rewrite docs to match Triton codes. by @zhiyuan1i in #381
- [RWKV7] Fix convert script by @zhiyuan1i in #383
- [Misc.] Update triton-nightly.yml by @zhiyuan1i in #382
- [PaTH] Add PaTH attention model and kernel by @sustcsonglin in #384
- [Tests] Enable tests with `causal_conv1d` on H100 CIs by @zhiyuan1i in #385
- [GDN]: initializing `A_log` and `dt_bias` in `_init_weights` by @HanGuo97 in #380
- [Utils] Add fused pack/unpack fns by @yzhangcs in #386
- [RWKV7] Strictly initialize rwkv7 according to RWKV-LM by @zhiyuan1i in #387
- [chore] switched to `processing_class` kwarg inside Trainer invocation by @timurcarstensen in #391
- [RWKV7] Update initialization to sync with latest RWKV-LM by @zhiyuan1i in #393
- [Token Shift]: Fix potential cuda kernel parameter error for varlen by @zhiyuan1i in #397
- [DeltaProduct] fix query conv cache, remove extraneous query convs by @timurcarstensen in #396
- [Misc.] Log warnings when Triton is older than 3.2.0 by @zhiyuan1i in #394
- [RWKV7]: clean `fused_addcmul_rwkv7` impls by @zhiyuan1i in #404
- [README] Update FoX venue info by @zhixuan-lin in #406
- Added details to some formulas, fixed the display error of the `L2 Loss` formula by @Beortext in #407
- [RWKV7] Change fp32 errors to warnings by @zhiyuan1i in #412
- [Misc.] Add `exist_ok=True` to all models by @zhiyuan1i in #413
- Add Rodimus impl into fla by @ziHoHe in #416
- Align RWKV7 LoRA Rank Initialization with official Implementation by @WuTianyi321 in #418
- [Canon] Add triton impls by @yzhangcs in #388
- [GDN] Support Gated Value Attention (GVA) by @Rafa-zy in #421
- [RWKV7]: clean some imps by @zhiyuan1i in #420
- [RoPE] Fix out-of-boundary bugs by @yzhangcs in #423
- [RWKV] Fix `cu_seqlens` with gradient checkpoint by @zhiyuan1i in #422
New Contributors
- @timurcarstensen made their first contribution in #391
- @ziHoHe made their first contribution in #416
- @WuTianyi321 made their first contribution in #418
- @Rafa-zy made their first contribution in #421
Full Changelog: v0.2.1...v0.2.2