Commit 7bba462

update doc

1 parent ae3da10

3 files changed: +46 -4 lines changed

docs/source/en/_toctree.yml

Lines changed: 4 additions & 2 deletions
```diff
@@ -11,8 +11,10 @@
     title: Using TEI locally with CPU
   - local: local_metal
     title: Using TEI locally with Metal
-  - local: local_gpu
-    title: Using TEI locally with GPU
+  - local: local_nvidia_gpu
+    title: Using TEI locally with Nvidia GPU
+  - local: local_amd_gpu
+    title: Using TEI locally with AMD GPU
   - local: private_models
     title: Serving private and gated models
   # - local: tei_cli
```

docs/source/en/local_amd_gpu.md

Lines changed: 40 additions & 0 deletions
```diff
@@ -0,0 +1,40 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+
+⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
+rendered properly in your Markdown viewer.
+
+-->
+
+# Using TEI locally with an AMD GPU
+
+Text-Embeddings-Inference supports [AMD GPUs with official ROCm support](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html), including the AMD Instinct MI210, MI250 and MI300, and some AMD Radeon series GPUs.
+
+To leverage AMD GPUs, Text-Embeddings-Inference relies on its Python backend, not on the [candle](https://github.com/huggingface/candle) backend used for CPU, Nvidia GPUs and Metal. Support in the Python backend is more limited (Bert embeddings) but easily extensible. We welcome contributions to extend the supported models.
+
+## Usage through Docker
+
+Using Docker is the recommended approach.
+
+```bash
+docker run --rm -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --net host \
+    --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 32g \
+    ghcr.io/huggingface/text-embeddings-inference:rocm-1.2.4 \
+    --model-id sentence-transformers/all-MiniLM-L6-v2
+```
+
+and
+
+```bash
+curl 127.0.0.1:80/embed \
+    -X POST -d '{"inputs":"What is Deep Learning?"}' \
+    -H 'Content-Type: application/json'
+```
```
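The `/embed` route shown in the new page also accepts a batch of inputs. A minimal sketch, not part of this commit, assuming the container from the block above is still serving on port 80:

```bash
# Batched request: "inputs" may be a JSON array; one embedding is returned per input.
curl 127.0.0.1:80/embed \
    -X POST -d '{"inputs":["What is Deep Learning?","What is ROCm?"]}' \
    -H 'Content-Type: application/json'
```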

docs/source/en/local_gpu.md renamed to docs/source/en/local_nvidia_gpu.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -14,9 +14,9 @@ rendered properly in your Markdown viewer.
 
 -->
 
-# Using TEI locally with GPU
+# Using TEI locally with Nvidia GPU
 
-You can install `text-embeddings-inference` locally to run it on your own machine with a GPU.
+You can install `text-embeddings-inference` locally to run it on your own machine with an Nvidia GPU.
 To make sure that your hardware is supported, check out the [Supported models and hardware](supported_models) page.
 
 ## Step 1: CUDA and NVIDIA drivers
```