1 parent 6a65742 commit 24cfc76
attn_gym/mods/latent_attention.py
@@ -1,4 +1,4 @@
-"""Implementation of Multi-head Level Attention (MLA) RoPE score modification from DeepSeek-V2.
+"""Implementation of Multi-head Latent Attention (MLA) RoPE score modification from DeepSeek-V2.
 
 Reference: https://arxiv.org/pdf/2405.04434 - DeepSeek-V2: A Strong, Economical, and
 Efficient Mixture-of-Experts Language Model
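The docstring being corrected refers to DeepSeek-V2's decoupled-RoPE scoring, where the attention score is the sum of a content ("no-position") term and a separate rotary term. A minimal NumPy sketch of that score decomposition is below; all names and dimensions here are illustrative assumptions, not taken from the repository's actual implementation.

```python
import numpy as np

# Hypothetical dimensions for illustration only (not from the commit).
seq_len, d_nope, d_rope = 4, 8, 2

rng = np.random.default_rng(0)
q_nope = rng.standard_normal((seq_len, d_nope))  # content query part (no positional encoding)
k_nope = rng.standard_normal((seq_len, d_nope))  # content key part
q_rope = rng.standard_normal((seq_len, d_rope))  # decoupled rotary query part
k_rope = rng.standard_normal((seq_len, d_rope))  # decoupled rotary key part

# MLA-style decoupled score: a content term plus a separate RoPE term,
# scaled by the combined head dimension.
scores = (q_nope @ k_nope.T + q_rope @ k_rope.T) / np.sqrt(d_nope + d_rope)
print(scores.shape)
```

The point of the decomposition is that the rotary part can be kept small and handled separately from the compressed latent content path, which is what the "RoPE score modification" in the docstring refers to.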