DeepSeek-V2-Chat-0628 currently uses excessive VRAM, possibly due to running as MHA instead of MLA.
Discussion here:
https://old.reddit.com/r/LocalLLaMA/comments/1e6ba6a/deepseekv2chat0628_weight_release_1_open_weight/ldtybpo/
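For scale, a rough back-of-the-envelope using the dimensions from the published DeepSeek-V2 config (not a measurement of any particular build): MLA projects keys and values into a shared low-rank latent, so only that latent plus a small decoupled RoPE key needs caching, i.e. kv_lora_rank + qk_rope_head_dim = 512 + 64 = 576 values per token per layer. A conventional MHA cache instead stores full per-head K and V: 128 heads x ((128 + 64) + 128) = 40,960 values per token per layer, roughly 71x more.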
Replies: 3 comments 1 reply

-
I think we already support MLA. What makes you think that we use excessive VRAM?
-
KV cache sizes are extremely large, as seen here: I'm running Q3_K_S (101.7 GB) on an M3 Max with 128 GB of memory and 122 GB allocated as VRAM, and it swaps even at small context sizes (<=2k, and for some reason even at 256). 4k and 8k are unusable.
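To put numbers on this, here is a rough sizing sketch (my own back-of-the-envelope, assuming an fp16 cache and the head counts and dimensions from the published DeepSeek-V2 config; not a measurement of llama.cpp's actual allocations):

```python
# Rough KV cache sizing for DeepSeek-V2, assuming an fp16 cache and the
# dimensions from the published DeepSeek-V2 config.json. A sketch of the
# arithmetic, not a measurement of llama.cpp's actual buffers.

N_LAYERS = 60        # num_hidden_layers
N_HEADS = 128        # num_attention_heads
QK_NOPE_DIM = 128    # qk_nope_head_dim (content part of each key/query head)
QK_ROPE_DIM = 64     # qk_rope_head_dim (decoupled positional part)
V_DIM = 128          # v_head_dim
KV_LORA_RANK = 512   # width of the compressed latent that MLA caches
FP16 = 2             # bytes per cached value

def mha_bytes_per_token():
    # Naive MHA-style caching: full per-head K and V in every layer.
    k = N_HEADS * (QK_NOPE_DIM + QK_ROPE_DIM)  # 24576 values
    v = N_HEADS * V_DIM                        # 16384 values
    return N_LAYERS * (k + v) * FP16           # ~4.7 MiB per token

def mla_bytes_per_token():
    # MLA caching: one compressed latent plus one shared RoPE key per layer.
    return N_LAYERS * (KV_LORA_RANK + QK_ROPE_DIM) * FP16  # ~67.5 KiB per token

for ctx in (256, 2048, 4096, 8192):
    print(f"ctx={ctx:5d}  MHA-style: {mha_bytes_per_token() * ctx / 2**30:6.2f} GiB"
          f"   MLA: {mla_bytes_per_token() * ctx / 2**30:6.3f} GiB")
```

If the cache really is materialized MHA-style, ~9.4 GiB at 2k context on top of 101.7 GB of weights leaves almost nothing of a 122 GB VRAM allocation for compute buffers, which matches the swapping described above; 4k and 8k (~18.8 and ~37.5 GiB) simply can't fit. An MLA cache would stay around half a GiB even at 8k.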