Commit 7801119

kekeboomboom authored and markpollack committed
Clarify Hugging Face TGI model support requirements
Add explicit model compatibility information to prevent failed deployments with unsupported architectures. Point users to standard endpoints as alternative for unsupported models.
1 parent c979d1c commit 7801119

1 file changed: 5 additions, 1 deletion

spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/huggingface.adoc

Lines changed: 5 additions & 1 deletion
@@ -1,6 +1,10 @@
 = Hugging Face Chat
 
-Hugging Face Inference Endpoints allow you to deploy and serve machine learning models in the cloud, making them accessible via an API.
+Hugging Face Text Generation Inference (TGI) is a specialized deployment solution for serving Large Language Models (LLMs) in the cloud, making them accessible via an API. TGI provides optimized performance for text generation tasks through features like continuous batching, token streaming, and efficient memory management.
+
+IMPORTANT: Text Generation Inference requires models to be compatible with its architecture-specific optimizations. While many popular LLMs are supported, not all models on Hugging Face Hub can be deployed using TGI. If you need to deploy other types of models, consider using standard Hugging Face Inference Endpoints instead.
+
+TIP: For a complete and up-to-date list of supported models and architectures, see the link:https://huggingface.co/docs/text-generation-inference/en/supported_models[Text Generation Inference supported models documentation].
 
 == Prerequisites
 
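For context on the "accessible via an API" claim in the new text: TGI serves models behind an HTTP `/generate` endpoint that takes a JSON body with an `inputs` field and a `parameters` object (per the TGI documentation). A minimal sketch of building such a request body — the helper function name and default values here are illustrative, not part of TGI or Spring AI:

```python
import json

def build_tgi_request(prompt, max_new_tokens=64):
    # Body shape for TGI's POST /generate endpoint:
    # "inputs" carries the prompt text, "parameters" carries
    # generation options such as max_new_tokens.
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return json.dumps(payload)

body = build_tgi_request("What is Deep Learning?", max_new_tokens=32)
```

The body is only constructed here, not sent; POSTing it to a deployed TGI endpoint with an `Authorization: Bearer <token>` header is what the Spring AI client handles for you.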
