Skip to content

Feature Request: Tencent Hunyuan-A13B model support #561

@Downtown-Case

Description

@Downtown-Case

80B/13B active MoE, good benchmarks. Seems right up ik_llama.cpp's alley, aka expert offloading like deepseek.

Uses a custom architecture with good old GQA and NTK rope scaling. At a glance it doesn't look like anything too exotic: https://huggingface.co/tencent/Hunyuan-A13B-Instruct/tree/main

Relevant main llama.cpp issue: ggml-org/llama.cpp#14415

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions