Code and Model Release

@Linyou Linyou released this 30 May 21:00
· 7 commits to main since this release

🤯 We have released both v1.0 and v1.1. The new model (v1.1) is even faster than FlashAttention-2, with a 12.2× faster forward pass and a 19.7× faster backward pass, yielding nearly a 2× inference speedup over v1.0.