Commit 82fd2b1

fix: do not automatically set latest
1 parent aec5efd commit 82fd2b1

File tree

7 files changed: +23 additions, -13 deletions


.github/workflows/build_75.yaml
Lines changed: 2 additions & 0 deletions

@@ -108,6 +108,8 @@
       images: |
         registry.internal.huggingface.tech/api-inference/text-embeddings-inference
         ghcr.io/huggingface/text-embeddings-inference
+      flavor: |
+        latest=false
       tags: |
         type=semver,pattern=turing-{{version}}
         type=semver,pattern=turing-{{major}}.{{minor}}

.github/workflows/build_80.yaml
Lines changed: 2 additions & 2 deletions

@@ -105,11 +105,11 @@
     id: meta-80
     uses: docker/metadata-action@v4.3.0
     with:
-      flavor: |
-        latest=auto
       images: |
         registry.internal.huggingface.tech/api-inference/text-embeddings-inference
         ghcr.io/huggingface/text-embeddings-inference
+      flavor: |
+        latest=false
       tags: |
         type=semver,pattern={{version}}
         type=semver,pattern={{major}}.{{minor}}

.github/workflows/build_86.yaml
Lines changed: 2 additions & 0 deletions

@@ -108,6 +108,8 @@
       images: |
         registry.internal.huggingface.tech/api-inference/text-embeddings-inference
         ghcr.io/huggingface/text-embeddings-inference
+      flavor: |
+        latest=false
       tags: |
         type=semver,pattern=86-{{version}}
         type=semver,pattern=86-{{major}}.{{minor}}

.github/workflows/build_89.yaml
Lines changed: 2 additions & 0 deletions

@@ -108,6 +108,8 @@
       images: |
         registry.internal.huggingface.tech/api-inference/text-embeddings-inference
         ghcr.io/huggingface/text-embeddings-inference
+      flavor: |
+        latest=false
       tags: |
         type=semver,pattern=89-{{version}}
         type=semver,pattern=89-{{major}}.{{minor}}

.github/workflows/build_90.yaml
Lines changed: 2 additions & 0 deletions

@@ -108,6 +108,8 @@
      images: |
        registry.internal.huggingface.tech/api-inference/text-embeddings-inference
        ghcr.io/huggingface/text-embeddings-inference
+      flavor: |
+        latest=false
      tags: |
        type=semver,pattern=hopper-{{version}}
        type=semver,pattern=hopper-{{major}}.{{minor}}

.github/workflows/build_cpu.yaml
Lines changed: 2 additions & 0 deletions

@@ -108,6 +108,8 @@
      images: |
        registry.internal.huggingface.tech/api-inference/text-embeddings-inference
        ghcr.io/huggingface/text-embeddings-inference
+      flavor: |
+        latest=false
      tags: |
        type=semver,pattern=cpu-{{version}}
        type=semver,pattern=cpu-{{major}}.{{minor}}
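
Each workflow diff above applies the same change to its docker/metadata-action step. As an illustrative sketch (the step name and single-image list here are assumptions, not copied from any one workflow file), the resulting configuration looks like:

```yaml
# Hypothetical excerpt of a build workflow step after this commit.
# With `latest=false`, pushing a semver tag such as v0.2.2 no longer
# moves the floating `latest` tag; only the explicit semver tags
# declared under `tags:` are generated.
- name: Extract metadata (tags, labels) for Docker
  id: meta
  uses: docker/metadata-action@v4.3.0
  with:
    images: |
      ghcr.io/huggingface/text-embeddings-inference
    flavor: |
      latest=false
    tags: |
      type=semver,pattern={{version}}
      type=semver,pattern={{major}}.{{minor}}
```

The default flavor is `latest=auto`, which tags semver releases as `latest` automatically; setting `latest=false` makes the published tags fully explicit, which is why the README below is updated to reference pinned `0.2.2` images instead of `latest`.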

README.md
Lines changed: 11 additions & 11 deletions

@@ -81,7 +81,7 @@ model=BAAI/bge-large-en-v1.5
 revision=refs/pr/5
 volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

-docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:latest --model-id $model --revision $revision
+docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:0.2.2 --model-id $model --revision $revision
 ```

 And then you can make requests like

@@ -223,15 +223,15 @@ Options:

 Text Embeddings Inference ships with multiple Docker images that you can use to target a specific backend:

-| Architecture                        | Image                                                       |
-|-------------------------------------|-------------------------------------------------------------|
-| CPU                                 | ghcr.io/huggingface/text-embeddings-inference:cpu-latest    |
-| Volta                               | NOT SUPPORTED                                               |
-| Turing (T4, RTX 2000 series, ...)   | ghcr.io/huggingface/text-embeddings-inference:turing-latest |
-| Ampere 80 (A100, A30)               | ghcr.io/huggingface/text-embeddings-inference:latest        |
-| Ampere 86 (A10, A40, ...)           | ghcr.io/huggingface/text-embeddings-inference:86-latest     |
-| Ada Lovelace (RTX 4000 series, ...) | ghcr.io/huggingface/text-embeddings-inference:89-latest     |
-| Hopper (H100)                       | ghcr.io/huggingface/text-embeddings-inference:hopper-latest |
+| Architecture                        | Image                                                      |
+|-------------------------------------|------------------------------------------------------------|
+| CPU                                 | ghcr.io/huggingface/text-embeddings-inference:cpu-0.2.2    |
+| Volta                               | NOT SUPPORTED                                              |
+| Turing (T4, RTX 2000 series, ...)   | ghcr.io/huggingface/text-embeddings-inference:turing-0.2.2 |
+| Ampere 80 (A100, A30)               | ghcr.io/huggingface/text-embeddings-inference:0.2.2        |
+| Ampere 86 (A10, A40, ...)           | ghcr.io/huggingface/text-embeddings-inference:86-0.2.2     |
+| Ada Lovelace (RTX 4000 series, ...) | ghcr.io/huggingface/text-embeddings-inference:89-0.2.2     |
+| Hopper (H100)                       | ghcr.io/huggingface/text-embeddings-inference:hopper-0.2.2 |

 ### API documentation

@@ -256,7 +256,7 @@ model=<your private model>
 volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
 token=<your cli READ token>

-docker run --gpus all -e HUGGING_FACE_HUB_TOKEN=$token -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:latest --model-id $model
+docker run --gpus all -e HUGGING_FACE_HUB_TOKEN=$token -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:0.2.2 --model-id $model
 ```

 ### Distributed Tracing

0 commit comments