Thank you for the excellent work on the highly optimized attention kernels in FlashMLA. The performance of the sparse forward kernel flash_mla_sparse_fwd for the prefill stage is particularly impressive.
I've noticed that while the library provides a complete forward and backward pass for dense attention, the sparse attention implementation appears to be forward-only. Are there plans to release a sparse attention backward kernel? A workaround sketch of what I mean is below.
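For concreteness, here is a minimal PyTorch reference of per-query token-selection sparse attention, where autograd supplies the (slow) backward that a fused backward kernel would replace. The function name, tensor shapes, and `indices` layout are my own assumptions for illustration, not FlashMLA's actual API:

```python
import torch

def sparse_attn_reference(q, k, v, indices):
    """Illustrative sparse attention: each batch element attends only
    to a selected subset of KV positions.
    q: [B, Hq, Dk], k: [B, S, Dk], v: [B, S, Dv],
    indices: [B, topk] long tensor of selected KV positions."""
    scale = q.shape[-1] ** -0.5
    b_idx = torch.arange(k.shape[0]).unsqueeze(-1)           # [B, 1]
    k_sel = k[b_idx, indices]                                # [B, topk, Dk]
    v_sel = v[b_idx, indices]                                # [B, topk, Dv]
    attn = torch.softmax(q @ k_sel.transpose(-1, -2) * scale, dim=-1)
    return attn @ v_sel                                      # [B, Hq, Dv]

# Gradients flow to q, k, v through the gather + softmax + matmuls,
# so autograd gives a backward pass, just a very inefficient one.
B, S, Hq, Dk, Dv, topk = 2, 1024, 8, 64, 64, 128
q = torch.randn(B, Hq, Dk, requires_grad=True)
k = torch.randn(B, S, Dk, requires_grad=True)
v = torch.randn(B, S, Dv, requires_grad=True)
idx = torch.randint(0, S, (B, topk))
out = sparse_attn_reference(q, k, v, idx)
out.sum().backward()  # what a fused sparse backward kernel would accelerate
```

This autograd path materializes the gathered keys/values and the attention matrix, so a dedicated kernel along the lines of `flash_mla_sparse_fwd` would be a big win for training.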
Thanks.