[CI] Add mteb testing for rerank models #19344
Merged
29 commits
60ab1e1 + test_rerank_models_mteb (noooop)
15b6840 + bm25s (noooop)
ca0e8a7 disable duplicate test (noooop)
e6d6ace fix (noooop)
335c375 fix (noooop)
7271552 fix (noooop)
018a3b2 fix (noooop)
89696eb fix (noooop)
c3718b1 fix 3.2.1 (noooop)
444b0f2 Merge branch 'vllm-project:main' into reranker (noooop)
3e6cfd4 upgrade st to the latest 4.1.0 (noooop)
0a48a24 fix (noooop)
a5900e8 fix (noooop)
217e57c Use float32 for torch.cumsum in MeanPool (noooop)
1a4e6bb use BAAI/bge-reranker-base for score tests (noooop)
88436d9 Merge branch 'vllm-project:main' into reranker (noooop)
5901d31 + tomaarsen/Qwen3-Reranker-0.6B-seq-cls test (noooop)
f8c164c refactor Qwen3-Reranker tests (noooop)
f409968 Merge branch 'vllm-project:main' into reranker (noooop)
5921b39 fix (noooop)
8519668 fix (noooop)
3a94eb7 fix (noooop)
2b187b0 try float32 (noooop)
5061458 + tasks metadata (noooop)
fb9d277 Merge branch 'vllm-project:main' into reranker (noooop)
90978a2 Merge branch 'vllm-project:main' into reranker (noooop)
4297410 MTEB_RERANK_TOL = 1e-3 (noooop)
f855280 Using float32 in PoolerHead (noooop)
f51e2ce Merge branch 'vllm-project:main' into reranker (noooop)
@@ -0,0 +1,56 @@

```python
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
import os

import pytest

from tests.models.language.pooling.mteb_utils import (MTEB_RERANK_LANGS,
                                                      MTEB_RERANK_TASKS,
                                                      MTEB_RERANK_TOL,
                                                      RerankClientMtebEncoder,
                                                      ScoreClientMtebEncoder,
                                                      run_mteb_rerank)
from tests.utils import RemoteOpenAIServer

os.environ["VLLM_LOGGING_LEVEL"] = "WARNING"

MODEL_NAME = "cross-encoder/ms-marco-MiniLM-L-6-v2"
MAIN_SCORE = 0.33702


@pytest.fixture(scope="module")
def server():
    args = [
        "--task", "score", "--enforce-eager", "--disable-uvicorn-access-log"
    ]

    with RemoteOpenAIServer(MODEL_NAME, args) as remote_server:
        yield remote_server


def test_mteb_score(server):
    url = server.url_for("score")
    encoder = ScoreClientMtebEncoder(MODEL_NAME, url)
    vllm_main_score = run_mteb_rerank(encoder, MTEB_RERANK_TASKS,
                                      MTEB_RERANK_LANGS)
    st_main_score = MAIN_SCORE

    print("VLLM main score: ", vllm_main_score)
    print("SentenceTransformer main score: ", st_main_score)
    print("Difference: ", st_main_score - vllm_main_score)

    assert st_main_score == pytest.approx(vllm_main_score, abs=MTEB_RERANK_TOL)


def test_mteb_rerank(server):
    url = server.url_for("rerank")
    encoder = RerankClientMtebEncoder(MODEL_NAME, url)
    vllm_main_score = run_mteb_rerank(encoder, MTEB_RERANK_TASKS,
                                      MTEB_RERANK_LANGS)
    st_main_score = MAIN_SCORE

    print("VLLM main score: ", vllm_main_score)
    print("SentenceTransformer main score: ", st_main_score)
    print("Difference: ", st_main_score - vllm_main_score)

    assert st_main_score == pytest.approx(vllm_main_score, abs=MTEB_RERANK_TOL)
```
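
Both tests end with the same check: the vLLM score must match the stored SentenceTransformer reference within an absolute tolerance. A minimal, standard-library sketch of that comparison follows. It is an illustration, not the PR's code: `within_abs_tol` is a hypothetical helper, the tolerance value is assumed from the commit titled "MTEB_RERANK_TOL = 1e-3" (the value actually defined in `mteb_utils` is not shown here), and the two candidate scores are made up.

```python
# Reference score copied from the test file in this PR.
MAIN_SCORE = 0.33702

# Assumption: tolerance taken from the commit title "MTEB_RERANK_TOL = 1e-3";
# the constant defined in mteb_utils is not visible in this diff.
ASSUMED_TOL = 1e-3


def within_abs_tol(expected: float, actual: float, tol: float) -> bool:
    """Hypothetical helper: True when |expected - actual| <= tol.

    This mirrors what pytest.approx(actual, abs=tol) checks when only an
    absolute tolerance is given (no relative tolerance is applied).
    """
    return abs(expected - actual) <= tol


# Made-up scores, for illustration only.
close_score = 0.33750  # differs from the reference by 0.00048
far_score = 0.33900    # differs from the reference by 0.00198

print("close score passes:", within_abs_tol(MAIN_SCORE, close_score, ASSUMED_TOL))
print("far score passes:  ", within_abs_tol(MAIN_SCORE, far_score, ASSUMED_TOL))
```

Under these assumptions the first score would pass and the second would fail, which is the behaviour the `assert` lines in `test_mteb_score` and `test_mteb_rerank` rely on.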