Feature Request: Improve Sampling API: Expose Top‑K/Top‑P Candidate Token Lists in C API #14612

@officiallyutso

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Currently, the logprobs option in the OpenAI-style wrapper lets users see the probabilities of generated tokens, which is great for research and debugging. However, there is no straightforward way at the native C API / CLI level to access the full list of candidate tokens and their probabilities, especially before sampling decisions are made (e.g., under top‑K or nucleus sampling).

Having direct access to these candidate distributions would:

  • Enable confidence-based stopping criteria

  • Facilitate custom sampling / selective decoding in application code

  • Provide better transparency into internal generation decisions

Motivation

This feature empowers developers to:

  • Inspect model confidence before outputting tokens

  • Implement advanced sampling like dynamic beam filtering

  • Write more explainable LLM-based systems

While logprobs support exists in the wrapper, exposing candidate distributions natively ensures broader accessibility (via CLI, C API, or other bindings).

Possible Implementation

Extend llama_sample_token / llama_sample_token_greedy (or create variants) to return a struct containing:

  • token_id

  • logit (or prob after softmax)

  • is_selected flag

Add equivalent CLI flags (e.g. --print-topk 10)

Expose the functionality in Python/C bindings consistent with high-level logprobs usage

Benchmark to ensure no significant inference slowdowns when the feature is inactive
