What is the benefit of storing hidden states and attention weights? #101
-
Thanks for your project! After reading your paper, I see that MemOS can store the KV cache, hidden states, and attention weights. I understand that the KV cache can be used to reload conversation history, but I cannot understand the purpose of storing hidden states and attention weights. Could you help explain that? Thanks!
Replies: 2 comments
-
@MarrytheToilet, please try to answer this user's question.
-
Thank you for your question.
In our classification of memory types, we divide them into three categories: plaintext memory, activation memory, and parameter memory. Among these, activation memory can include forms such as KV caches, hidden states, and attention weights.
In the current version of MemOS, the KV cache is the primary form of activation memory, owing to its stable performance, interpretability, and the maturity of research enabling effective and fast integration. Other forms of activation memory, such as hidden states and attention weights, are still under active research and are not yet mature enough for stable engineering use. A minimal sketch of KV-cache reuse follows below.
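To make the KV-cache point concrete, here is a minimal sketch of reloading a cached conversation prefix with Hugging Face Transformers so the history does not have to be re-encoded. This is not MemOS's actual API; the model name and prompts are placeholders chosen only for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "gpt2" is a small stand-in model for illustration, not what MemOS prescribes.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Encode the conversation history once and keep its key/value states.
history = "User: What is MemOS?\nAssistant: MemOS is a memory OS for LLMs.\n"
history_ids = tokenizer(history, return_tensors="pt").input_ids
with torch.no_grad():
    out = model(history_ids, use_cache=True)
past_key_values = out.past_key_values  # the reusable "activation memory"

# Later, continue the conversation without re-encoding the history:
# only the new turn is fed in, together with the cached states.
new_turn = "User: Why store the KV cache?\nAssistant:"
new_ids = tokenizer(new_turn, return_tensors="pt").input_ids
with torch.no_grad():
    out = model(new_ids, past_key_values=past_key_values, use_cache=True)
next_token_id = out.logits[:, -1, :].argmax(dim=-1)
print(tokenizer.decode(next_token_id[0]))
```

In a real system the cached states would be serialized to storage and reloaded across sessions, which is exactly why the KV cache is a convenient and well-understood form of activation memory.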
That said, recent studies have demonstrated that hidden states can serve as carriers of memory. For example, by injecting steering vectors into specific layers during inference, a model can be guided to generate content in a particular style, an approach often understood as controlling abstract or stylistic memory forms [1][2]. However, neither hidden states nor attention weights are ready for large-scale deployment yet, which is why the current development of MemOS focuses primarily on KV-cache-based acceleration of inference and compatibility with frameworks like Hugging Face's Transformers. A sketch of the steering-vector idea follows below.
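Here is a loose sketch of the steering-vector idea from [1] (contrastive activation addition): a direction is computed as the difference between hidden states on a contrastive pair of prompts, then added back into the residual stream of one layer during generation. The model, layer index, prompts, and scaling factor below are illustrative assumptions, not MemOS code and not the exact setup of [1] (which used Llama 2).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small stand-in model; [1] demonstrated this on Llama 2
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

LAYER = 6       # which transformer block to steer (a hyperparameter)
STRENGTH = 4.0  # how strongly to push along the steering direction

def last_token_hidden(prompt: str) -> torch.Tensor:
    """Hidden state of the final token at the output of block LAYER."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        hs = model(ids, output_hidden_states=True).hidden_states
    # hidden_states[0] is the embedding output, so index LAYER + 1
    # corresponds to the output of block LAYER.
    return hs[LAYER + 1][0, -1, :]

# The steering vector is the activation difference on a contrastive pair.
steering_vec = (last_token_hidden("I love this. It is wonderful.")
                - last_token_hidden("I hate this. It is terrible."))

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple; output[0] is the residual stream.
    steered = output[0] + STRENGTH * steering_vec
    return (steered,) + output[1:]

# Inject the vector on every forward pass through the chosen block.
handle = model.transformer.h[LAYER].register_forward_hook(add_steering)
try:
    ids = tokenizer("The movie was", return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=20, do_sample=False)
    print(tokenizer.decode(out[0]))
finally:
    handle.remove()  # always detach the hook afterwards
```

The point is only that a hidden-state direction can act as a reusable, injectable piece of "memory"; making this as stable and manageable as the KV cache is what remains open research.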
References
[1] Panickssery, N., Gabrieli, N., Schulz, J., Tong, M., Hubinger, E., & Turner, A. M. (2024). Steering Llama 2 via Contrastive Activation Addition. arXiv:2312.06681. https://arxiv.org/abs/2312.06681