
[RFC][Feature][Model] Add templated fallback HF transformers model backend in SRT #5471

@XuehaiPan

Checklist

Feature Request

This issue requests a fallback model backend that lets SRT (SGLang Runtime) support newly released open-source models out of the box, without extra per-model maintenance effort. It would improve SRT's coverage of the models hosted on Hugging Face.

This will be similar to what we (Hugging Face Transformers maintainers) do for vLLM.
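
To make the request concrete, here is a minimal sketch of the fallback idea: look up a native implementation by architecture name, and fall back to a generic Transformers-based backend when none is registered. Everything in it is an assumption for illustration only (the `NATIVE_MODELS` table, the `resolve_model_backend` helper, and the string identifiers are hypothetical); it is not SRT's actual model loader.

```python
from typing import Dict, List

# Hypothetical table of architectures that already have a hand-written
# native implementation. The entries are illustrative, not SRT's real registry.
NATIVE_MODELS: Dict[str, str] = {
    "LlamaForCausalLM": "native:LlamaForCausalLM",
    "Qwen2ForCausalLM": "native:Qwen2ForCausalLM",
}

# Name of the generic fallback backend proposed in this issue.
FALLBACK_BACKEND = "TransformersModel"


def resolve_model_backend(architectures: List[str]) -> str:
    """Pick a native implementation if one exists, else the generic fallback.

    `architectures` mirrors the `architectures` list found in a Hugging Face
    model config.
    """
    for arch in architectures:
        if arch in NATIVE_MODELS:
            return NATIVE_MODELS[arch]
    # No native implementation is registered: use the templated fallback.
    return FALLBACK_BACKEND


if __name__ == "__main__":
    print(resolve_model_backend(["LlamaForCausalLM"]))
    print(resolve_model_backend(["BrandNewForCausalLM"]))
```

With this shape, a newly released architecture would need no new entry in the native table; it would be routed to the fallback until (or unless) someone adds an optimized implementation.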

Road Map

  • Initial TransformersModel backend for language models (LMs), built on the Transformers Attention Interface.
  • Initial support for popular VLMs.
  • SRT DP support for TransformersModel.
  • SRT PP support for TransformersModel.

Feel free to comment on and update the Road Map.

cc Transformers maintainers @LysandreJik @ArthurZucker
cc SGLang team for code co-review @merrymercy @Ying1123 @zhyncs

Related resources

No response
