🤯 We have released both v1.0 and v1.1. The new version is even faster than FlashAttention-2, with a 12.2× faster forward pass and a 19.7× faster backward pass, yielding nearly a 2× inference speedup over v1.0.