Description of the feature request:
Currently, the gemini-embedding-001 endpoint returns a single aggregated vector representing the entire text input. This request is to add a feature, likely through an API parameter such as output_representation="token" or return_token_embeddings=True, that allows users to retrieve the sequence of embeddings for each individual token in the input text.
The desired output would be the un-pooled last hidden state of the model: a list or tensor of vectors, one per input token. This is standard functionality in many open-source transformer models and is essential for tasks requiring token-level analysis.
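To make the proposal concrete, the request and response could look roughly like the following. This is only a sketch: neither parameter name exists in the current API (both are the suggestions above), and the response shape and values are illustrative placeholders.

```python
# Hypothetical request: the return_token_embeddings flag is the proposed
# (not existing) parameter from this feature request.
request = {
    "model": "gemini-embedding-001",
    "content": "The bank raised interest rates.",
    "return_token_embeddings": True,  # proposed flag, hypothetical
}

# Illustrative response: one vector per input token (the un-pooled last
# hidden state) instead of a single pooled vector. Tokenization and
# dimensionality are placeholders, not real model output.
response = {
    "tokens": ["The", "bank", "raised", "interest", "rates", "."],
    "token_embeddings": [[0.0] * 8 for _ in range(6)],  # 6 tokens x 8 dims
}

# The key invariant: one embedding per token, all of equal dimension.
assert len(response["token_embeddings"]) == len(response["tokens"])
assert all(len(v) == 8 for v in response["token_embeddings"])
```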
What problem are you trying to solve with this feature?
The primary problem this feature solves is the inability to perform fine-grained, token-level NLP tasks; the current sentence-level embedding is unsuitable for them.
My specific use case is Word Sense Disambiguation (WSD):
To differentiate between meanings of a word like "bank" (a financial institution vs. a river bank), I need to extract the contextual vector for the specific token "bank" within each sentence. By clustering these token-specific vectors, I can automatically identify the different senses of the word. The current aggregated sentence embedding makes this impossible, as the word's token-specific representation is lost in pooling.
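With token-level vectors available, the clustering step described above could be sketched as follows. The embeddings here are tiny hand-made stand-ins (a real model would produce high-dimensional contextual vectors from the requested API), and the similarity threshold of 0.8 is an arbitrary choice for the demo.

```python
import math

# Stand-in contextual vectors for the token "bank" in four sentences.
# In practice these would be the per-token embeddings returned by the
# requested feature; here they are hand-made so the two senses separate.
bank_vectors = {
    "I deposited cash at the bank.":   [0.9, 0.1],
    "The bank approved my loan.":      [0.8, 0.2],
    "We fished from the river bank.":  [0.1, 0.9],
    "Erosion wore away the bank.":     [0.2, 0.8],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Greedy single-pass clustering: start a new sense cluster whenever a
# vector is dissimilar (cosine < 0.8) to every existing cluster centroid.
clusters = []  # each: {"sum": elementwise vector sum, "members": sentences}
for sentence, vec in bank_vectors.items():
    for c in clusters:
        centroid = [s / len(c["members"]) for s in c["sum"]]
        if cosine(centroid, vec) >= 0.8:
            c["sum"] = [s + v for s, v in zip(c["sum"], vec)]
            c["members"].append(sentence)
            break
    else:
        clusters.append({"sum": list(vec), "members": [sentence]})

senses = [c["members"] for c in clusters]
# With these stand-in vectors, the two financial sentences and the two
# river sentences fall into separate clusters.
```

The same pattern extends to any ambiguous word: collect the target token's vector across many sentences, then cluster; each cluster approximates one sense.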
Any other information you'd like to share?
No response