```bash
docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.2 --model-id $model --revision $revision
```

<Tip>

Here we pass `revision=refs/pr/5` because the `safetensors` variant of this model is currently in a pull request.
We also recommend sharing a volume with the Docker container (`volume=$PWD/data`) to avoid downloading weights every run.

</Tip>

Once you have deployed a model, you can use the `embed` endpoint by sending requests:

```bash
curl 127.0.0.1:8080/embed \
    -X POST \
    -d '{"inputs":"Today is a nice day"}' \
    -H 'Content-Type: application/json'
```
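
As a quick sanity check, the response can be piped through `jq`. This is a sketch under two assumptions: `jq` is installed, and the response is a JSON array holding one embedding (an array of floats) per input:

```bash
# Print the dimension of the first returned embedding.
curl -s 127.0.0.1:8080/embed \
    -X POST \
    -d '{"inputs":"Today is a nice day"}' \
    -H 'Content-Type: application/json' | jq '.[0] | length'
```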

Re-ranker models are deployed with the same command; here `$model` and `$revision` select the re-ranker to serve:

```bash
volume=$PWD/data

docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.2 --model-id $model --revision $revision
```

Once you have deployed a model, you can use the `rerank` endpoint to rank the similarity between a query and a list of texts:

```bash
curl 127.0.0.1:8080/rerank \
    -X POST \
    -d '{"query": "What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
    -H 'Content-Type: application/json'
```
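
To pull out the best match, the response can again be piped through `jq`. This sketch assumes the endpoint returns a JSON list of `{index, score}` objects ordered by descending score:

```bash
# Print the index and score of the highest-ranked text.
curl -s 127.0.0.1:8080/rerank \
    -X POST \
    -d '{"query": "What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
    -H 'Content-Type: application/json' | jq '.[0]'
```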

For classic Sequence Classification models, use the `predict` endpoint:

```bash
curl 127.0.0.1:8080/predict \
    -X POST \
    -d '{"inputs":"I like you."}' \
    -H 'Content-Type: application/json'
```
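
The same trick works for classification output; this sketch assumes the response is a JSON list of `{score, label}` objects:

```bash
# Print the first predicted label and its score.
curl -s 127.0.0.1:8080/predict \
    -X POST \
    -d '{"inputs":"I like you."}' \
    -H 'Content-Type: application/json' | jq '.[0]'
```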

## Batching

You can send multiple inputs in a batch. For example, for embeddings:

```bash
curl 127.0.0.1:8080/embed \
    -X POST \
    -d '{"inputs":["Today is a nice day", "I like you"]}' \
    -H 'Content-Type: application/json'
```

And for Sequence Classification:

```bash
curl 127.0.0.1:8080/predict \
    -X POST \
    -d '{"inputs":[["I like you."], ["I hate pineapples"]]}' \
    -H 'Content-Type: application/json'
```
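
Batches can also be assembled programmatically. As a sketch, assuming `jq` is installed and a hypothetical `texts.txt` holds one input text per line, the request body can be built on the fly:

```bash
# Build {"inputs": [...]} from the lines of texts.txt and send it to /embed.
# texts.txt is a hypothetical file with one input text per line.
jq -R -s '{inputs: (split("\n") | map(select(. != "")))}' texts.txt |
    curl -s 127.0.0.1:8080/embed \
        -X POST \
        -d @- \
        -H 'Content-Type: application/json'
```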