🧠 New Models
What's Changed
- [GDN] Fix tiling bugs once gv applied by @yzhangcs in #589
- [Conv] Add comprehensive docstring and change default backend to triton by @zhiyuan1i in #592
- Update cumprod_householder_bwd.py by @SeepingFragranceLock in #593
- [FIX] Correct cumsum dimension in normalize_output by @sirluk in #594
- [Triton] Add autotune caching support for Triton kernels by @zhiyuan1i in #598
- [DeltaFormer] Add Model by @Nathancgy in #585
- [DeltaFormer] Replace GenerationMixin with FLAGenerationMixin and upd… by @zhiyuan1i in #600
- [Cache] Fix from_legacy_cache by @zhiyuan1i in #605
- [Deps] Make pytest an optional dependency by @wedaly in #610
- [DeltaFormer] Fixed testing ops error by @Nathancgy in #602
- [Conv] Fix potential OOB problems by @yzhangcs in #615
- [Deps] Minimize deps by @zhiyuan1i in #617
- Determine the chunk size at the kernel entry by @yzhangcs in #619
- Add KDA by @yzhangcs in #621
- [Lint] Migrate from flake8/isort to ruff for faster linting by @zhiyuan1i in #613
New Contributors
- @SeepingFragranceLock made their first contribution in #593
- @sirluk made their first contribution in #594
- @Nathancgy made their first contribution in #585
- @wedaly made their first contribution in #610
Full Changelog: v0.3.2...v0.4.0