[Feat]: speculative decoding support #280

@x-0D

Description

Add UI and backend features to enable speculative decoding support.

ggml-org/llama.cpp#10455

Use Case
Faster inference for slower (larger) models: a small draft model proposes several tokens ahead, and the larger model verifies them.

https://arxiv.org/abs/2211.17192
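
For anyone unfamiliar with the technique, here is a minimal sketch of the idea in Python. It is not how llama.cpp or this project implements it: it uses a simplified greedy variant (the linked paper uses a rejection-sampling scheme for sampled output), and `speculative_decode`, `greedy_decode`, `toy_target`, and `toy_draft` are hypothetical names made up for illustration. The "models" are plain functions that map a token prefix to the next token.

```python
from typing import Callable, List

# Hypothetical stand-in for a model: maps a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       max_new_tokens: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding sketch: draft proposes, target verifies."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. The cheap draft model proposes up to k tokens ahead.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # 2. The target model checks each proposed position. A real backend
        #    would score all positions in one batched forward pass; this
        #    sketch calls the target once per position for clarity.
        for i, guess in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if expected != guess:
                # Keep the agreed prefix, then take the target's own token.
                tokens.extend(proposal[:i])
                tokens.append(expected)
                break
        else:
            # Every proposed token was accepted; add one bonus target token
            # if there is still room.
            tokens.extend(proposal)
            if len(tokens) < goal:
                tokens.append(target(tokens))
    return tokens


def greedy_decode(target: Model, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Baseline: decode with the target model only, one token at a time."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(target(tokens))
    return tokens


def toy_target(prefix: List[int]) -> int:
    # Toy "large" model: deterministic function of the last token.
    return (prefix[-1] * 3 + 1) % 7


def toy_draft(prefix: List[int]) -> int:
    # Toy "small" model: agrees with the target except after token 5.
    return 0 if prefix[-1] == 5 else toy_target(prefix)


if __name__ == "__main__":
    print(speculative_decode(toy_target, toy_draft, [1], max_new_tokens=8))
    print(greedy_decode(toy_target, [1], max_new_tokens=8))
```

In this greedy variant the output is identical to decoding with the target model alone; the speedup in practice comes from the target verifying several drafted tokens per pass instead of producing one token per pass.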

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request)
