Thank you for the excellent work on the highly optimized attention kernels in FlashMLA. The performance of the sparse forward kernel flash_mla_sparse_fwd for the prefill stage is particularly impressive.
I've noticed that while the library provides a complete forward and backward pass for dense attention, the sparse attention implementation appears to be forward-only. Are there plans to release a sparse attention backward kernel? A workaround sketch of what I mean is below.
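For concreteness, here is a minimal PyTorch reference of per-query token-selection sparse attention, where autograd supplies the (slow) backward that a fused backward kernel would replace. The function name, tensor shapes, and `indices` layout are my own assumptions for illustration, not FlashMLA's actual API:

```python
import torch

def sparse_attn_reference(q, k, v, indices):
    """Illustrative sparse attention: each batch element attends only
    to a selected subset of KV positions.
    q: [B, Hq, Dk], k: [B, S, Dk], v: [B, S, Dv],
    indices: [B, topk] long tensor of selected KV positions."""
    scale = q.shape[-1] ** -0.5
    b_idx = torch.arange(k.shape[0]).unsqueeze(-1)           # [B, 1]
    k_sel = k[b_idx, indices]                                # [B, topk, Dk]
    v_sel = v[b_idx, indices]                                # [B, topk, Dv]
    attn = torch.softmax(q @ k_sel.transpose(-1, -2) * scale, dim=-1)
    return attn @ v_sel                                      # [B, Hq, Dv]

# Gradients flow to q, k, v through the gather + softmax + matmuls,
# so autograd gives a backward pass, just a very inefficient one.
B, S, Hq, Dk, Dv, topk = 2, 1024, 8, 64, 64, 128
q = torch.randn(B, Hq, Dk, requires_grad=True)
k = torch.randn(B, S, Dk, requires_grad=True)
v = torch.randn(B, S, Dv, requires_grad=True)
idx = torch.randint(0, S, (B, topk))
out = sparse_attn_reference(q, k, v, idx)
out.sum().backward()  # what a fused sparse backward kernel would accelerate
```

This autograd path materializes the gathered keys/values and the attention matrix, so a dedicated kernel along the lines of `flash_mla_sparse_fwd` would be a big win for training.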
Thanks.