
Conversation


@ChenZiHong-Gavin ChenZiHong-Gavin commented Oct 27, 2025

This PR introduces several LLM API clients and inference backends, including:

  1. http_client
  2. ollama_client
  3. openai_client
  4. hf
  5. sglang
  6. tgi (WIP)
  7. trt (WIP)
  8. vllm (WIP)

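The backend names above suggest a dispatch from a configured backend string to a client class. A minimal sketch of that pattern, with hypothetical class and function names (the actual classes and module layout in this PR may differ):

```python
# Hypothetical sketch of mapping a configured backend name to a client
# class; all names here are illustrative, not the PR's actual API.
from dataclasses import dataclass


@dataclass
class LLMConfig:
    backend: str
    model: str
    base_url: str = ""
    api_key: str = ""


class OpenAIClient:
    def __init__(self, config: LLMConfig):
        self.config = config


class OllamaClient:
    def __init__(self, config: LLMConfig):
        self.config = config


class HuggingFaceClient:
    def __init__(self, config: LLMConfig):
        self.config = config


# Registry from a SYNTHESIZER_BACKEND / TRAINEE_BACKEND value to a client
# class (illustrative subset; http_api is assumed OpenAI-compatible here).
BACKENDS = {
    "openai_api": OpenAIClient,
    "http_api": OpenAIClient,
    "ollama_api": OllamaClient,
    "huggingface": HuggingFaceClient,
}


def create_client(config: LLMConfig):
    """Instantiate the client registered for config.backend."""
    try:
        cls = BACKENDS[config.backend]
    except KeyError:
        raise ValueError(f"Unsupported backend: {config.backend}")
    return cls(config)
```

A registry like this keeps the WIP backends (tgi, trt, vllm) out of the map until they are ready, so an unfinished backend fails fast with a clear error instead of half-working.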
The .env changes can be found in .env.example:

# Tokenizer
TOKENIZER_MODEL=

# LLM
# Supported backends: http_api, openai_api, ollama_api, ollama, huggingface, tgi, sglang, tensorrt

# http_api / openai_api
SYNTHESIZER_BACKEND=openai_api
SYNTHESIZER_MODEL=gpt-4o-mini
SYNTHESIZER_BASE_URL=
SYNTHESIZER_API_KEY=
TRAINEE_BACKEND=openai_api
TRAINEE_MODEL=gpt-4o-mini
TRAINEE_BASE_URL=
TRAINEE_API_KEY=

# # ollama_api
# SYNTHESIZER_BACKEND=ollama_api
# SYNTHESIZER_MODEL=gemma3
# SYNTHESIZER_BASE_URL=http://localhost:11434
#
# Note: TRAINEE with the ollama_api backend is not supported yet, as ollama_api does not return logprobs.

# # huggingface
# SYNTHESIZER_BACKEND=huggingface
# SYNTHESIZER_MODEL=Qwen/Qwen2.5-0.5B-Instruct
#
# TRAINEE_BACKEND=huggingface
# TRAINEE_MODEL=Qwen/Qwen2.5-0.5B-Instruct

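The env file above defines one variable group per role (SYNTHESIZER_* and TRAINEE_*). A minimal sketch of how such a group could be read into a config dict, assuming plain `os.getenv` lookups (the PR's actual loader may differ):

```python
# Hypothetical loader for the SYNTHESIZER_* / TRAINEE_* variables shown
# in .env.example; illustrative only, not the PR's actual code.
import os


def load_llm_config(prefix: str) -> dict:
    """Collect the backend settings for one role, e.g. "SYNTHESIZER"."""
    return {
        "backend": os.getenv(f"{prefix}_BACKEND", "openai_api"),
        "model": os.getenv(f"{prefix}_MODEL", ""),
        "base_url": os.getenv(f"{prefix}_BASE_URL", ""),
        "api_key": os.getenv(f"{prefix}_API_KEY", ""),
    }


# Example using the defaults from .env.example:
os.environ["SYNTHESIZER_BACKEND"] = "openai_api"
os.environ["SYNTHESIZER_MODEL"] = "gpt-4o-mini"
cfg = load_llm_config("SYNTHESIZER")
```

Prefixing by role keeps the synthesizer and trainee independent, so they can point at different backends (e.g. a hosted API for synthesis and a local huggingface model for the trainee).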
@ChenZiHong-Gavin ChenZiHong-Gavin marked this pull request as ready for review October 29, 2025 11:25
@ChenZiHong-Gavin ChenZiHong-Gavin merged commit 6e4a142 into main Oct 29, 2025
3 checks passed
@ChenZiHong-Gavin ChenZiHong-Gavin deleted the feature/inference-backend branch October 29, 2025 11:25
@tpoisonooo

  1. The corresponding README should be updated so that the supported features are visible.
  2. Another useful addition would be automatic load scaling, e.g. by calling the ray[serve] API.

https://github.com/SeedLLM/DataPolisher/pull/1/files#diff-5648623a11374bdc84a573cac0a89d4e93d162c80c8938c82780f76c96c4373c

@tpoisonooo

Something like this:
[image]

or like this:
[image]

Either can serve as a reference.
