
[RFC] common, server : add top-a sampler #5612


Open
Artefact2 wants to merge 1 commit into master from artefact-min-p-sq

Conversation

Artefact2 (Collaborator)

Top-A sampling is an interesting sampling technique that behaves like a dynamic Min-P. More details here: https://github.com/BlinkDL/RWKV-LM/tree/4cb363e5aa31978d801a47bc89d28e927ab6912e#the-top-a-sampling-method

When the most likely token has a high probability, top-a behaves like min-p with a high p value; when the most likely token has a low probability, it behaves like min-p with a low p value. The a parameter selects the cutoff: tokens with probability below a*P^2 are discarded, where P is the probability of the most likely token.

This commit adds top-a sampling to llama.cpp and server.cpp, which improves compatibility with clients like AI Horde. Because this sampler works very similarly to min-p, adding it is mostly free in terms of actual code changes.
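For illustration, here is a minimal, self-contained sketch of the top-a cutoff described above. This is not the code from this PR; the candidate struct and function name are made up for the example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical candidate type: token id plus its softmax probability.
struct candidate {
    int   id;
    float p;
};

// Keep only candidates with p >= a * P^2, where P is the highest probability.
// a = 0 disables the filter; larger a prunes more aggressively.
static void top_a_filter(std::vector<candidate> & cands, float a) {
    if (a <= 0.0f || cands.empty()) {
        return;
    }

    // P: probability of the most likely token.
    float p_max = 0.0f;
    for (const auto & c : cands) {
        p_max = std::max(p_max, c.p);
    }

    const float cutoff = a * p_max * p_max;

    // Discard every candidate below the cutoff.
    cands.erase(
        std::remove_if(cands.begin(), cands.end(),
                       [cutoff](const candidate & c) { return c.p < cutoff; }),
        cands.end());
}
```

For example, with a = 0.2 and P = 0.9 the cutoff is 0.162, so most of the tail is dropped; with the same a but P = 0.1 the cutoff falls to 0.002, so many more candidates survive. This is the "dynamic min-p" behavior described above.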

@Artefact2 force-pushed the artefact-min-p-sq branch from 11f1859 to 8d0033a on March 9, 2024 at 14:38
@mofosyne added the Review Complexity : High, enhancement, and generation quality labels on May 10, 2024
Labels
enhancement · generation quality · Review Complexity : High

2 participants