
[Model] Re-add the implicit conversion feature for as_seq_cls_model #21103


Open
wants to merge 5 commits into main

Conversation

noooop
Contributor

@noooop noooop commented Jul 17, 2025

Essential Elements of an Effective PR Description Checklist

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

ForSequenceClassification models using the TRANSFORMERS implementation are currently served as TransformersForCausalLM + as_classification_model, rather than directly as TransformersForSequenceClassification.

That is, implicit conversion is essential.

  • We should implicitly convert ForSequenceClassification models instead of adding them one by one to registry.py (see the sketch after this list).

  • Hitchhiked fixes included in this PR:

    • is_matryoshka

      When matryoshka_dimensions is [] or None, is_matryoshka should be False (see the short sketch after this list).

    • get_and_verify_max_len

      Only consider max_model_len from tokenizer_config for pooling models that use absolute position embeddings.

    • Fix the tensor-parallel (TP) weight loader used by load_weights_no_post_processing.
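
To make the intent concrete, here is a minimal sketch of what architecture resolution with implicit conversion could look like. It is illustrative only: resolve_model_cls, the registry dict, and the way as_seq_cls_model is passed in are hypothetical stand-ins for the real code in vllm/model_executor/models/registry.py and the as_seq_cls_model adapter.

```python
# Illustrative sketch only, not the actual vLLM registry code.
# `registry` maps architecture names to model classes, and `as_seq_cls_model`
# stands in for the adapter that wraps a *ForCausalLM class with a
# sequence-classification head.

_SEQ_CLS_SUFFIX = "ForSequenceClassification"

def resolve_model_cls(arch: str, registry: dict, as_seq_cls_model):
    """Return the model class for `arch`, converting implicitly if needed."""
    if arch in registry:
        # Explicitly registered architectures keep their dedicated class.
        return registry[arch]

    if arch.endswith(_SEQ_CLS_SUFFIX):
        # e.g. "Qwen2ForSequenceClassification" -> "Qwen2ForCausalLM"
        causal_arch = arch[: -len(_SEQ_CLS_SUFFIX)] + "ForCausalLM"
        if causal_arch in registry:
            # Implicit conversion: reuse the causal-LM implementation and
            # attach a classification head instead of registering every
            # *ForSequenceClassification variant by hand.
            return as_seq_cls_model(registry[causal_arch])

    raise ValueError(f"Model architecture {arch!r} is not supported.")
```

With something along these lines in place, a newly registered *ForCausalLM model gets its sequence-classification variant for free, which is the behaviour this PR restores.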
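
And a tiny sketch of the intended is_matryoshka behaviour from the hitchhiked fix, written as a standalone helper rather than the actual config property in vLLM:

```python
# Hypothetical helper showing the intended semantics of the is_matryoshka fix:
# an empty matryoshka_dimensions list should behave like None, i.e. the model
# is not treated as a Matryoshka embedding model.

def is_matryoshka(matryoshka_dimensions) -> bool:
    # bool([]) and bool(None) are both False, so an empty list no longer
    # counts as "Matryoshka enabled".
    return bool(matryoshka_dimensions)

assert is_matryoshka([256, 512]) is True
assert is_matryoshka([]) is False
assert is_matryoshka(None) is False
```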

cc @DarkLight1337 @maxdebayser

Test Plan

pytest -s -vvv tests/models/test_transformers.py::test_classify
pytest -s -vvv tests/models/test_initialization.py::test_implicit_converted_models

Test Result

passed

(Optional) Documentation Update

Known Issues

ForSequenceClassification models using the TRANSFORMERS implementation are still served as TransformersForCausalLM + as_classification_model, rather than directly as TransformersForSequenceClassification.

as_classification_model only supports a GPT-style score head and does not support a BERT-style classifier head, which prevents vLLM from supporting models like DebertaV2ForSequenceClassification.
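
As a rough illustration of the gap (simplified modules, not the actual vLLM or Transformers implementations): the GPT-style score head that as_classification_model can attach is a single linear projection on a hidden state, while the BERT/DeBERTa-style classifier head adds a pooler and dropout before the classifier layer.

```python
import torch
import torch.nn as nn

class GPTStyleScoreHead(nn.Module):
    """GPT-style: a single linear score layer, usually applied to the
    hidden state of the last non-padding token."""

    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.score = nn.Linear(hidden_size, num_labels, bias=False)

    def forward(self, last_token_hidden: torch.Tensor) -> torch.Tensor:
        return self.score(last_token_hidden)

class BertStyleClassifierHead(nn.Module):
    """BERT/DeBERTa-style: pooler (dense + tanh), dropout, then a classifier,
    usually applied to the [CLS]/first-token hidden state."""

    def __init__(self, hidden_size: int, num_labels: int, dropout: float = 0.1):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, first_token_hidden: torch.Tensor) -> torch.Tensor:
        pooled = torch.tanh(self.dense(first_token_hidden))
        return self.classifier(self.dropout(pooled))
```

Supporting heads of the second shape is roughly what would be needed to cover models like DebertaV2ForSequenceClassification.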

noooop added 2 commits July 17, 2025 15:55
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <noooop@126.com>
@mergify mergify bot added the llama (Related to Llama models), new-model (Requests to new models), and qwen (Related to Qwen models) labels on Jul 17, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request re-introduces implicit model conversion for sequence classification models, which is a great simplification. The changes involve refactoring the model registry and configuration logic to automatically handle these conversions. My review found a critical issue in the model registry where a cached object is being mutated, which could lead to incorrect behavior for other models. I've provided a detailed explanation and a suggested fix for this issue.

noooop added 2 commits July 17, 2025 16:25
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <noooop@126.com>

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, covering a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

Signed-off-by: wang.yuqi <noooop@126.com>
Member

@hmellor hmellor left a comment


This is a super cool PR! Thanks for enabling this for the Transformers backend!

Would it be better if the Transformers backend directly supported ForSequenceClassification instead of the substitution you're doing in vllm/model_executor/models/registry.py?

@noooop
Contributor Author

noooop commented Jul 17, 2025

This is a super cool PR! Thanks for enabling this for the Transformers backend!

Would it be better if the Transformers backend directly supported ForSequenceClassification instead of the substitution you're doing in vllm/model_executor/models/registry.py?

Having the Transformers backend directly support ForSequenceClassification would definitely be better, since it would let users run models like DebertaV2ForSequenceClassification, which vLLM's as_classification_model does not support.

I am not familiar with the Transformers backend and don't know what difficulties might arise.

@hmellor
Member

hmellor commented Jul 17, 2025

Once #20543 is merged I can look at adding explicit support for ForSequenceClassification models to the Transformers backend.

To clarify, is this PR specifically about enabling the Transformers backend or is it useful for other reasons too?

@noooop
Contributor Author

noooop commented Jul 17, 2025

This PR should wait for #21058 to land.

cc @DarkLight1337 @maxdebayser

Labels
llama (Related to Llama models), new-model (Requests to new models), qwen (Related to Qwen models)