@@ -92,16 +92,16 @@ Text Embeddings Inference currently supports CamemBERT, and XLM-RoBERTa Sequence

Below are some examples of the currently supported models:

- | Task               | Model Type  | Model ID                                                                                     | Revision    |
- | ------------------ | ----------- | -------------------------------------------------------------------------------------------- | ----------- |
- | Re-Ranking         | XLM-RoBERTa | [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large)                    | `refs/pr/4` |
- | Re-Ranking         | XLM-RoBERTa | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base)                      | `refs/pr/5` |
- | Sentiment Analysis | RoBERTa     | [SamLowe/roberta-base-go_emotions](https://huggingface.co/SamLowe/roberta-base-go_emotions) |             |
+ | Task               | Model Type  | Model ID                                                                                     |
+ | ------------------ | ----------- | -------------------------------------------------------------------------------------------- |
+ | Re-Ranking         | XLM-RoBERTa | [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large)                    |
+ | Re-Ranking         | XLM-RoBERTa | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base)                      |
+ | Sentiment Analysis | RoBERTa     | [SamLowe/roberta-base-go_emotions](https://huggingface.co/SamLowe/roberta-base-go_emotions) |

### Docker

```shell
- model=Alibaba-NLP/gte-base-en-v1.5
+ model=BAAI/bge-large-en-v1.5
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.4 --model-id $model
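Once the container is running, embeddings can be requested over the HTTP API. A minimal sketch against TEI's `/embed` route (the input text is just an example):

```shell
# request an embedding for a single input string
curl 127.0.0.1:8080/embed \
    -X POST \
    -d '{"inputs":"What is Deep Learning?"}' \
    -H 'Content-Type: application/json'
```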
@@ -382,10 +382,9 @@ downstream performance.

```shell
model=BAAI/bge-reranker-large
- revision=refs/pr/4
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

- docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.4 --model-id $model --revision $revision
+ docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.4 --model-id $model
```

And then you can rank the similarity between a query and a list of texts with:
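The diff context ends here; the request that follows would look roughly like this sketch, assuming TEI's `/rerank` route (query and texts are illustrative):

```shell
# rank the texts by relevance to the query
curl 127.0.0.1:8080/rerank \
    -X POST \
    -d '{"query": "What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
    -H 'Content-Type: application/json'
```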
@@ -451,7 +450,7 @@ found [here](https://github.com/huggingface/text-embeddings-inference/blob/main/
You can use the gRPC API by adding the `-grpc` tag to any TEI Docker image. For example:

```shell
- model=Alibaba-NLP/gte-base-en-v1.5
+ model=BAAI/bge-large-en-v1.5
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.4-grpc --model-id $model
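For context, a call against the gRPC image might look like the following sketch, assuming `grpcurl` is available and TEI's `tei.v1.Embed/Embed` method:

```shell
# embed a single input over gRPC (plaintext, no TLS)
grpcurl -d '{"inputs": "What is Deep Learning"}' -plaintext 0.0.0.0:8080 tei.v1.Embed/Embed
```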
@@ -485,7 +484,7 @@ cargo install --path router -F metal
You can now launch Text Embeddings Inference on CPU with:

```shell
- model=Alibaba-NLP/gte-base-en-v1.5
+ model=BAAI/bge-large-en-v1.5

text-embeddings-router --model-id $model --port 8080
```
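To confirm the locally launched router is serving, a quick check, assuming TEI's documented `/health` and `/info` routes:

```shell
# returns 200 once the model is loaded and the server is ready
curl -i 127.0.0.1:8080/health

# prints model metadata (model ID, dtype, max input length, ...)
curl 127.0.0.1:8080/info
```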
@@ -523,7 +522,7 @@ cargo install --path router -F candle-cuda -F http --no-default-features
You can now launch Text Embeddings Inference on GPU with:

```shell
- model=Alibaba-NLP/gte-base-en-v1.5
+ model=BAAI/bge-large-en-v1.5

text-embeddings-router --model-id $model --port 8080
```