add ChatTemplateKwargs to ChatCompletionRequest #980

Merged
merged 1 commit into from
May 13, 2025

justa-cai
Contributor

@justa-cai justa-cai commented Apr 30, 2025

Describe the change
Add support for the chat_template_kwargs request parameter.

Provide OpenAI documentation link
vLLM Qwen3 thinking modes

curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "Qwen/Qwen3-8B",
  "messages": [
    {"role": "user", "content": "Give me a short introduction to large language models."}
  ],
  "temperature": 0.7,
  "top_p": 0.8,
  "top_k": 20,
  "max_tokens": 8192,
  "presence_penalty": 1.5,
  "chat_template_kwargs": {"enable_thinking": false}
}' 
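
For Go callers, the same kind of request body can be sketched as follows. This is a minimal sketch that uses only encoding/json: the chatRequest and chatMessage types and the buildRequestBody helper are hypothetical stand-ins rather than the library's own types, and only the chat_template_kwargs field and its JSON tag are taken from this PR. The sampling parameters from the curl example are left out for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatMessage and chatRequest are hypothetical, trimmed-down stand-ins for
// the library's request types; only the fields needed here are included.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
	// Field name and JSON tag as proposed in this PR.
	ChatTemplateKwargs map[string]any `json:"chat_template_kwargs,omitempty"`
}

// buildRequestBody (hypothetical helper) serializes a request that sets
// enable_thinking through chat_template_kwargs.
func buildRequestBody(enableThinking bool) ([]byte, error) {
	req := chatRequest{
		Model: "Qwen/Qwen3-8B",
		Messages: []chatMessage{
			{Role: "user", Content: "Hello"},
		},
		ChatTemplateKwargs: map[string]any{"enable_thinking": enableThinking},
	}
	return json.Marshal(req)
}

func main() {
	body, err := buildRequestBody(false)
	if err != nil {
		panic(err)
	}
	// The body carries "chat_template_kwargs":{"enable_thinking":false}.
	fmt.Println(string(body))
}
```

Whether the server honors enable_thinking depends on the backend and the model's chat template; the sketch only shows how the key reaches the JSON body.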

codecov bot commented Apr 30, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.93%. Comparing base (774fc9d) to head (9ed7474).
Report is 104 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master     #980       +/-   ##
===========================================
- Coverage   98.46%   85.93%   -12.53%     
===========================================
  Files          24       43       +19     
  Lines        1364     2268      +904     
===========================================
+ Hits         1343     1949      +606     
- Misses         15      300      +285     
- Partials        6       19       +13     

@sashabaranov sashabaranov requested a review from Copilot May 3, 2025 20:57
@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds support for a new parameter "chat_template_kwargs" to the ChatCompletionRequest structure to allow for non-standard parameters in chat completions.

  • Added the ChatTemplateKwargs field to ChatCompletionRequest to support extra configuration parameters (e.g., thinking mode for Qwen3).

chat.go Outdated
@@ -275,6 +275,8 @@ type ChatCompletionRequest struct {
Metadata map[string]string `json:"metadata,omitempty"`
// Configuration for a predicted output.
Prediction *Prediction `json:"prediction,omitempty"`
// ExtraBody provides a way to add non-standard parameters to the request body. such as think mode for qwen3

Copilot AI May 3, 2025

[nitpick] Consider updating this comment to reference 'ChatTemplateKwargs' to match the field name and improve grammar by capitalizing 'such' to 'Such' to enhance clarity.

Suggested change
- // ExtraBody provides a way to add non-standard parameters to the request body. such as think mode for qwen3
+ // ChatTemplateKwargs provides a way to add non-standard parameters to the request body. Such as think mode for qwen3.

chat.go Outdated
@@ -275,6 +275,8 @@ type ChatCompletionRequest struct {
Metadata map[string]string `json:"metadata,omitempty"`
// Configuration for a predicted output.
Prediction *Prediction `json:"prediction,omitempty"`
// ExtraBody provides a way to add non-standard parameters to the request body. such as think mode for qwen3
ChatTemplateKwargs map[string]interface{} `json:"chat_template_kwargs,omitempty"`
Owner

Suggested change
- ChatTemplateKwargs map[string]interface{} `json:"chat_template_kwargs,omitempty"`
+ ChatTemplateKwargs map[string]any `json:"chat_template_kwargs,omitempty"`
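
Because the field is tagged omitempty, a request that leaves it unset should serialize without the key, so backends that do not expect it never see it. The sketch below checks that using encoding/json alone; the request type and the hasKwargsKey helper are hypothetical stand-ins, not the library's code. It also relies on `any` being an alias for `interface{}` (Go 1.18+), so the suggested change does not alter the wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// request is a hypothetical stand-in that mirrors only the new field.
type request struct {
	Model              string         `json:"model"`
	ChatTemplateKwargs map[string]any `json:"chat_template_kwargs,omitempty"`
}

// hasKwargsKey (hypothetical helper) marshals a request and reports whether
// the resulting JSON object contains the chat_template_kwargs key.
func hasKwargsKey(kwargs map[string]any) (bool, error) {
	body, err := json.Marshal(request{Model: "Qwen/Qwen3-8B", ChatTemplateKwargs: kwargs})
	if err != nil {
		return false, err
	}
	var decoded map[string]json.RawMessage
	if err := json.Unmarshal(body, &decoded); err != nil {
		return false, err
	}
	_, ok := decoded["chat_template_kwargs"]
	return ok, nil
}

func main() {
	// A nil map is dropped by omitempty.
	unset, err := hasKwargsKey(nil)
	if err != nil {
		panic(err)
	}

	// map[string]interface{} and map[string]any are the same type,
	// so a value declared with the older spelling is accepted as-is.
	var legacy map[string]interface{} = map[string]any{"enable_thinking": false}
	set, err := hasKwargsKey(legacy)
	if err != nil {
		panic(err)
	}

	fmt.Println(unset, set) // prints: false true
}
```

An empty but non-nil map is also omitted, since omitempty treats zero-length maps as empty.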

@torrischen

@sashabaranov
Hi, I wonder if this PR will be merged soon?
Our project uses this repo and we are currently running Qwen3, so we urgently need this feature to switch between thinking and non-thinking modes.

@sashabaranov
Owner

@justa-cai @torrischen There are still a few pending comments, and I'd add that we should name the variable explicitly so it is obvious that the parameter is not part of the OpenAI first-party API. I'd be happy to merge once all of this is fixed 🫶

@justa-cai
Contributor Author

ChatTemplateKwargs
Additional kwargs to pass to the template renderer. Will be accessible by the chat template.
vLLM

@justa-cai
Contributor Author

@sashabaranov please help merge

Owner

@sashabaranov sashabaranov left a comment

Thank you for the update!

@sashabaranov sashabaranov merged commit 6aaa732 into sashabaranov:master May 13, 2025
2 of 3 checks passed
4 participants