How to get an LLM's embeddings during batch inference? #7851

Open · 1 task done
importrayhan opened this issue Apr 25, 2025 · 0 comments
Labels: enhancement (New feature or request), pending (This problem is yet to be addressed)

@importrayhan

Reminder

  • I have read the above rules and searched the existing issues.

Description

Sometimes the hidden states feeding the output layer, or the prediction logits, are more useful than the predicted tokens themselves. To achieve features similar to those in Guidance-AI or layer-wise analysis, is there any possible configuration for batch inference?

For example,
--output-layer: -1
--output-layer-length: 10
to obtain the prediction logits of the top 10 tokens.
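
For reference, here is a minimal sketch of the requested behavior using the Hugging Face Transformers API directly, outside of any batch-inference pipeline. The flag names above are hypothetical and the model name below is a placeholder; `output_hidden_states=True` plus a top-k over the final-position logits approximates what `--output-layer: -1` and `--output-layer-length: 10` might return.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; substitute the model under evaluation
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Left-pad so the final position of every row is a real token, not padding
# (GPT-2 has no pad token, so reuse EOS).
tokenizer.padding_side = "left"
tokenizer.pad_token = tokenizer.eos_token

prompts = ["The capital of France is", "Deep learning is"]
batch = tokenizer(prompts, return_tensors="pt", padding=True)

with torch.no_grad():
    out = model(**batch, output_hidden_states=True)

# Hidden states of the last transformer layer (layer index -1):
# shape (batch, seq_len, hidden_size).
last_hidden = out.hidden_states[-1]

# Next-token logits at the final position, then the top 10 candidates.
next_logits = out.logits[:, -1, :]
top_logits, top_ids = next_logits.topk(10, dim=-1)

for prompt, logits, ids in zip(prompts, top_logits, top_ids):
    print(prompt)
    for logit, tok in zip(logits.tolist(), tokenizer.convert_ids_to_tokens(ids.tolist())):
        print(f"  {tok!r}: {logit:.3f}")
```

Exposing the same tensors from the batch-inference entry point would cover both the embedding and the top-k logits use cases.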

Pull Request

No response

@importrayhan added the enhancement and pending labels on Apr 25, 2025