[CI Failure]: Language Models Test (Extended Pooling) #20461

### Name of failing test

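See the lists under "Describe the failing test" below. The `test_cross_encoder_*` tests in `models/language/pooling/test_scoring.py` are still failing.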

### Basic information

- [ ] Flaky test
- [ ] Can reproduce locally
- [ ] Caused by external libraries (e.g. bug in `transformers`)

### 🧪 Describe the failing test

Remaining failures:
```
FAILED models/language/pooling/test_scoring.py::test_cross_encoder_1_to_1[cross-encoder/ms-marco-MiniLM-L-6-v2] - assert 9.265625 == 1.0 ± 1.0e-02
  comparison failed
  Obtained: 9.265625
  Expected: 1.0 ± 1.0e-02
FAILED models/language/pooling/test_scoring.py::test_cross_encoder_1_to_N[cross-encoder/ms-marco-MiniLM-L-6-v2] - assert 9.265625 == 1.0 ± 1.0e-02
  comparison failed
  Obtained: 9.265625
  Expected: 1.0 ± 1.0e-02
FAILED models/language/pooling/test_scoring.py::test_cross_encoder_N_to_N[cross-encoder/ms-marco-MiniLM-L-6-v2] - assert 9.265625 == 1.0 ± 1.0e-02
  comparison failed
  Obtained: 9.265625
  Expected: 1.0 ± 1.0e-02
```
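One observation that may help triage: all three tests obtain the same value, 9.265625, where a score close to 1.0 is expected. A value of that size looks like an unnormalized logit. The sketch below only checks the arithmetic. It assumes (unconfirmed) that the expected score is the sigmoid of the raw output; the helper `sigmoid` and the variable names are illustrative and not taken from the test suite.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, mapping a raw logit to the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))


# Values copied from the failure log above.
obtained = 9.265625
expected = 1.0
tolerance = 1.0e-02

# The raw value is far outside the tolerance, as reported by the test.
raw_ok = abs(obtained - expected) <= tolerance

# Assumption: the reference score is sigmoid(raw output).
activated = sigmoid(obtained)
activated_ok = abs(activated - expected) <= tolerance

print(f"raw:       {obtained} within tolerance: {raw_ok}")
print(f"activated: {activated:.6f} within tolerance: {activated_ok}")
```

Under that assumption the activated value is about 0.9999, which is within the tolerance. This is consistent with a missing activation on the score output, but it is a hypothesis and not a confirmed root cause.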

Fixed by #20168:
```
FAILED models/language/pooling/test_embedding.py::test_models[False-sentence-transformers/all-MiniLM-L12-v2] - pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig
  Value error, User-specified max_model_len (512) is greater than the derived max_model_len (max_position_embeddings=128 or model_max_length=None in model's config.json). This may lead to incorrect model outputs or CUDA errors. To allow overriding this maximum, set the env var VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 [type=value_error, input_value=ArgsKwargs((), {'model': ...attention_dtype': None}), input_type=ArgsKwargs]
    For further information visit https://errors.pydantic.dev/2.11/v/value_error
FAILED models/language/pooling/test_embedding.py::test_models[False-sentence-transformers/stsb-roberta-base-v2] - pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig
  Value error, User-specified max_model_len (512) is greater than the derived max_model_len (max_position_embeddings=75 or model_max_length=None in model's config.json). This may lead to incorrect model outputs or CUDA errors. To allow overriding this maximum, set the env var VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 [type=value_error, input_value=ArgsKwargs((), {'model': ...attention_dtype': None}), input_type=ArgsKwargs]
    For further information visit https://errors.pydantic.dev/2.11/v/value_error
FAILED models/language/pooling/test_gte.py::test_embed_models_mteb[model_info9] - RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
FAILED models/language/pooling/test_gte.py::test_embed_models_mteb[model_info10] - RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
FAILED models/language/pooling/test_gte.py::test_embed_models_correctness[model_info9] - RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
FAILED models/language/pooling/test_gte.py::test_embed_models_correctness[model_info10] - RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
```
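For reference, the two `ValidationError` messages above describe a length check: the requested `max_model_len` (512) exceeds the value derived from the model's `config.json` (128 and 75), and the override is only allowed when `VLLM_ALLOW_LONG_MAX_MODEL_LEN=1` is set. The sketch below is a minimal illustration of that rule as stated in the message. `resolve_max_model_len` is a hypothetical helper, not vLLM's actual implementation, and the shortened error text is also an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_max_model_len(
    user_len: Optional[int],
    derived_len: int,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Hypothetical helper mirroring the rule described in the error message."""
    env = os.environ if env is None else env
    if user_len is None:
        # Nothing requested: fall back to the value derived from config.json.
        return derived_len
    allow_long = env.get("VLLM_ALLOW_LONG_MAX_MODEL_LEN") == "1"
    if user_len > derived_len and not allow_long:
        raise ValueError(
            f"User-specified max_model_len ({user_len}) is greater than "
            f"the derived max_model_len ({derived_len})."
        )
    return user_len


# The two cases from the log: 512 requested, 128 and 75 derived.
for derived in (128, 75):
    try:
        resolve_max_model_len(512, derived, env={})
    except ValueError as exc:
        print(exc)
```

With `env={"VLLM_ALLOW_LONG_MAX_MODEL_LEN": "1"}` the same calls return 512 instead of raising.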

### 📝 History of failing test

Failing on `main` since 27th June, e.g. https://buildkite.com/organizations/vllm/analytics/suites/ci-1/tests/7c3bdcad-8f70-86e6-b83a-d0f0ab07fd71?period=7days&tags=scm.branch%3Amain

### CC List.

@noooop can you take a look at this?
