Your current environment
vLLM Production Stack on Kube with Helm.
Helm values:
servingEngineSpec:
  runtimeClassName: ''
  modelSpec:
    - name: bge-reranker-v2-m3
      repository: vllm/vllm-openai
      tag: v0.9.1
      modelURL: BAAI/bge-reranker-v2-m3
      replicaCount: 1
      requestCPU: 8
      requestMemory: 16Gi
      requestGPU: 1
    - name: jina-reranker-v2-base-multilingual
      repository: vllm/vllm-openai
      tag: v0.9.1
      modelURL: jinaai/jina-reranker-v2-base-multilingual
      replicaCount: 1
      requestCPU: 8
      requestMemory: 8Gi
      requestGPU: 1
      vllmConfig:
        extraArgs:
          - --trust-remote-code
🐛 Describe the bug
jinaai/jina-reranker-v2-base-multilingual does not support long-context reranking (inputs above 1024 tokens) with vLLM, while outside vLLM (with Transformers) it does.
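For comparison, something along these lines works outside vLLM (a minimal sketch following the model card's compute_score usage; the query, document text, and max_length value are illustrative, not the exact inputs from our deployment):

```python
# Sketch of direct Transformers usage (per the model card's compute_score API).
# The query/document strings and max_length below are illustrative only.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "jinaai/jina-reranker-v2-base-multilingual",
    torch_dtype="auto",
    trust_remote_code=True,  # same flag we pass to vLLM via extraArgs
)
model.eval()

query = "Which plan covers annual dental checkups?"
documents = ["<a document that tokenizes to well over 1024 tokens>"]

sentence_pairs = [[query, doc] for doc in documents]
# A max_length above 1024 is accepted here, unlike the vLLM /v1/rerank endpoint.
scores = model.compute_score(sentence_pairs, max_length=2048)
print(scores)
```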
This is the error we get from /v1/rerank when invoking the request. We also tested with BAAI/bge-reranker-v2-m3, which works without issue on long context (>1024 tokens); it is only long-context inputs (>1024 tokens) on the jina reranking model that fail.
BadRequestError: status_code: 400, body: {'object': 'error', 'message': "This model's maximum context length is 1024 tokens. However, you requested 1301 tokens in the input for embedding generation. Please reduce the length of the input.", 'type': 'BadRequestError', 'param': None, 'code': 400}
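A request along the following lines reproduces it (hypothetical sketch: the router host, query, and document text are placeholders; only the total token count of the input matters):

```python
# Hypothetical reproduction of the failing /v1/rerank call against the
# production-stack router; host, query, and document text are placeholders.
import requests

payload = {
    "model": "jinaai/jina-reranker-v2-base-multilingual",
    "query": "example query",
    # any document long enough that query + document exceed 1024 tokens
    "documents": ["<~1300-token document text>"],
}
resp = requests.post("http://<router-service>/v1/rerank", json=payload, timeout=60)
print(resp.status_code, resp.text)  # 400 with the error shown above
```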
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.