How to get an LLM's embeddings during batch inference? #7851

Open · 1 task done
importrayhan opened this issue Apr 25, 2025 · 0 comments
Labels: enhancement (New feature or request), pending (This problem is yet to be addressed)

@importrayhan

Reminder

  • I have read the above rules and searched the existing issues.

Description

Sometimes the hidden states feeding the output layer, or the prediction logits, are more useful than the predicted tokens themselves. To achieve features similar to those in Guidance-AI or layer-wise analysis, is there any possible configuration for batch inference?

For example,
--output-layer: -1
--output-layer-length: 10
to obtain the prediction logits of the top 10 tokens.
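
For reference, here is a minimal sketch of the requested behavior using the Hugging Face Transformers API directly, outside of any batch-inference pipeline. The flag names above are hypothetical and the model name below is a placeholder; `output_hidden_states=True` plus a top-k over the final-position logits approximates what `--output-layer: -1` and `--output-layer-length: 10` might return.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; substitute the model under evaluation
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Left-pad so the final position of every row is a real token, not padding
# (GPT-2 has no pad token, so reuse EOS).
tokenizer.padding_side = "left"
tokenizer.pad_token = tokenizer.eos_token

prompts = ["The capital of France is", "Deep learning is"]
batch = tokenizer(prompts, return_tensors="pt", padding=True)

with torch.no_grad():
    out = model(**batch, output_hidden_states=True)

# Hidden states of the last transformer layer (layer index -1):
# shape (batch, seq_len, hidden_size).
last_hidden = out.hidden_states[-1]

# Next-token logits at the final position, then the top 10 candidates.
next_logits = out.logits[:, -1, :]
top_logits, top_ids = next_logits.topk(10, dim=-1)

for prompt, logits, ids in zip(prompts, top_logits, top_ids):
    print(prompt)
    for logit, tok in zip(logits.tolist(), tokenizer.convert_ids_to_tokens(ids.tolist())):
        print(f"  {tok!r}: {logit:.3f}")
```

Exposing the same tensors from the batch-inference entry point would cover both the embedding and the top-k logits use cases.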

Pull Request

No response

@importrayhan added the enhancement and pending labels on Apr 25, 2025