HydraNet is a state-of-the-art transformer architecture that combines Multi-Query Attention (MQA), Mixture of Experts (MoE), and continuous learning capabilities.
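Since the repository only names the techniques, here is a minimal, illustrative sketch of how MQA and an MoE feed-forward layer can be combined in one transformer block. This is not HydraNet's actual implementation; the module names, dimensions, and top-1 router below are assumptions made for demonstration.

```python
# Illustrative sketch only -- not HydraNet's real code. All sizes and the
# top-1 routing scheme are assumptions chosen to keep the example small.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiQueryAttention(nn.Module):
    """Multi-Query Attention: many query heads share a single key/value head."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)            # one query per head
        self.kv_proj = nn.Linear(d_model, 2 * self.d_head)   # single shared K and V
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k, v = self.kv_proj(x).chunk(2, dim=-1)              # each (b, t, d_head)
        k = k.unsqueeze(1).expand(-1, self.n_heads, -1, -1)  # broadcast over heads
        v = v.unsqueeze(1).expand(-1, self.n_heads, -1, -1)
        attn = F.scaled_dot_product_attention(q, k, v)       # (b, h, t, d_head)
        return self.out_proj(attn.transpose(1, 2).reshape(b, t, -1))

class MoEFeedForward(nn.Module):
    """Mixture of Experts: a router sends each token to its top-1 expert MLP."""
    def __init__(self, d_model: int, d_ff: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        flat = x.reshape(-1, d)
        gates = F.softmax(self.router(flat), dim=-1)
        top_gate, top_idx = gates.max(dim=-1)                # top-1 routing
        out = torch.zeros_like(flat)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i
            if mask.any():
                out[mask] = top_gate[mask].unsqueeze(-1) * expert(flat[mask])
        return out.reshape(b, t, d)

class Block(nn.Module):
    """One transformer block combining MQA with an MoE feed-forward layer."""
    def __init__(self, d_model=256, n_heads=8, d_ff=1024, n_experts=4):
        super().__init__()
        self.attn = MultiQueryAttention(d_model, n_heads)
        self.moe = MoEFeedForward(d_model, d_ff, n_experts)
        self.ln1, self.ln2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.attn(self.ln1(x))
        return x + self.moe(self.ln2(x))

if __name__ == "__main__":
    y = Block()(torch.randn(2, 16, 256))
    print(y.shape)  # torch.Size([2, 16, 256])
```

MQA shrinks the KV cache by keeping a single key/value head, while the MoE layer grows parameter count without a matching increase in per-token compute; the continuous-learning aspect mentioned above is not illustrated here.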