Generic Questions on GPU Indexing Speed #41721
Replies: 2 comments
-
In Milvus, the search() interface accepts a list of query vectors:
We call the length of this list "nq". With a small nq you will not see much difference between the CPU and GPU indexes, so try a large one, for example nq=1000.
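As a minimal sketch of what a large-nq call looks like, the helper below sends all query vectors in one search() request, so nq equals the length of the list. The collection name, the dimension of 768, and the helper names are hypothetical. The client is assumed to be a pymilvus MilvusClient already connected to a server with the collection loaded.

```python
import random


def make_queries(nq, dim, seed=0):
    """Build nq random query vectors of length dim (placeholder data only)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(nq)]


def batched_search(client, collection_name, queries, top_k=10):
    """Issue ONE search() call carrying every query, so nq == len(queries)."""
    return client.search(
        collection_name=collection_name,
        data=queries,   # list of vectors; its length is nq
        limit=top_k,    # number of neighbours returned per query
    )


# Hypothetical usage against a running server:
# from pymilvus import MilvusClient
# client = MilvusClient(uri="http://localhost:19530")
# hits = batched_search(client, "my_collection", make_queries(1000, 768))
```

Comparing the latency of this call at nq=1 and at nq=1000 on the same index shows whether batching changes the picture.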
-
Scenario good for GPU index:
-
GPU Indexing
The Milvus v2.5.x documentation on GPU Index states that GPUs can increase indexing throughput. However, in experiments on my local setup, the indexing speed of the GPU algorithms was almost the same as that of the CPU ones.
Server Configuration
The server I used has the following configuration:
Milvus Setup
We used Milvus version 2.5.9, set up following the documentation here.
Workload Description
The dataset we used contains around 28.5M entries; we inserted only the embeddings and their unique IDs into the database.
Experiment Process
Using the pymilvus interface, we created 32000 requests in total beforehand and performed vector search with 32 threads sending requests in parallel. Each request batches 100 searches.
Profiling Results
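For readers who want to reproduce the load pattern from the Experiment Process section (prebuilt requests sent from 32 threads, 100 searches per request), a rough harness could look like the following. This is only a sketch and not the script used for the measurements. The names run_load and send_request are hypothetical, and send_request stands for whatever function issues one batched search call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(send_request, requests, num_threads=32):
    """Send prebuilt requests from a thread pool.

    requests is a list of batches (each batch is a list of query vectors).
    Returns (elapsed_seconds, queries_per_second).
    """
    total_queries = sum(len(batch) for batch in requests)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # list() waits for every request and re-raises any worker exception
        list(pool.map(send_request, requests))
    elapsed = time.perf_counter() - start
    qps = total_queries / elapsed if elapsed > 0 else float("inf")
    return elapsed, qps
```

With 32000 requests of 100 searches each, total_queries is 3,200,000, so the reported queries per second is that number divided by the wall-clock time.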
Basic Tests with one GPU
The following experiments were run with a single H100, and GPU indexing does not seem to accelerate the query speed much.

Increasing vector dimension (still one GPU used)
We observed that GPU utilization was quite low and suspected that there was not enough computation relative to the graph search, so we repeated the experiment with larger vector dimensions and a limited number of CPU cores. We limited the standalone service to 32 cores by changing the following fields in docker-compose.yml.
The results are as follows:

Increasing the number of GPUs used
We also tried using multiple GPUs, but the performance was almost the same (even slightly slower). In the Docker setup, we changed the following field in docker-compose.yml.
The results are as follows:

Questions