feat: Qdrant external retriever #154

Merged 12 commits on Oct 10, 2024
4 changes: 4 additions & 0 deletions .github/workflows/pr-e2e-tests.yaml
@@ -45,6 +45,10 @@ jobs:
        ports:
          - 7687:7687
          - 7474:7474
      qdrant:
        image: qdrant/qdrant
        ports:
          - 6333:6333

    steps:
      - name: Install graphviz package
4 changes: 4 additions & 0 deletions .github/workflows/scheduled-e2e-tests.yaml
@@ -52,6 +52,10 @@ jobs:
        credentials:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      qdrant:
        image: qdrant/qdrant
        ports:
          - 6333:6333

    steps:
      - name: Install graphviz package
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -35,6 +35,7 @@
- Added support for Cohere LLM and embeddings - added optional dependency to `cohere`.
- Added support for Anthropic LLM - added optional dependency to `anthropic`.
- Added support for MistralAI LLM - added optional dependency to `mistralai`.
- Added support for Qdrant - added optional dependency to `qdrant-client`.

### Fixed
- Resolved import issue with the Vertex AI Embeddings class.
6 changes: 6 additions & 0 deletions docs/source/api.rst
@@ -163,6 +163,12 @@ PineconeNeo4jRetriever
.. autoclass:: neo4j_graphrag.retrievers.external.pinecone.pinecone.PineconeNeo4jRetriever
    :members: search

QdrantNeo4jRetriever
====================

.. autoclass:: neo4j_graphrag.retrievers.external.qdrant.qdrant.QdrantNeo4jRetriever
    :members: search


********
Embedder
31 changes: 31 additions & 0 deletions docs/source/user_guide_rag.rst
@@ -327,6 +327,8 @@ We provide implementations for the following retrievers:
      - Use this retriever when vectors are saved in a Weaviate vector database
    * - :ref:`PineconeNeo4jRetriever <pinecone-neo4j-retriever-user-guide>`
      - Use this retriever when vectors are saved in a Pinecone vector database
    * - :ref:`QdrantNeo4jRetriever <qdrant-neo4j-retriever-user-guide>`
      - Use this retriever when vectors are saved in a Qdrant vector database

Retrievers all expose a `search` method that we will discuss in the next sections.

@@ -672,6 +674,35 @@ Pinecone Retrievers

Also see :ref:`pineconeneo4jretriever`.

.. _qdrant-neo4j-retriever-user-guide:

Qdrant Retrievers
-----------------

.. note::

    In order to import this retriever, the Qdrant Python client must be installed:
    `pip install qdrant-client`


.. code:: python

    from qdrant_client import QdrantClient
    from neo4j_graphrag.retrievers import QdrantNeo4jRetriever

    client = QdrantClient(...)  # construct the Qdrant client instance

    retriever = QdrantNeo4jRetriever(
        driver=driver,
        client=client,
        collection_name="my-collection",
        # Payload field holding the value of the matching Neo4j node's id property
        id_property_external="neo4j_id",
        id_property_neo4j="id",
        embedder=embedder,
    )

See :ref:`qdrantneo4jretriever`.


Other Retrievers
===================
31 changes: 31 additions & 0 deletions examples/qdrant/README.md
@@ -0,0 +1,31 @@
### Start services locally

Run the following command to spin up Neo4j and Qdrant containers.

```bash
docker compose -f tests/e2e/docker-compose.yml up
```
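
Before writing data, it can help to confirm that both containers accept connections. The following is a minimal sketch using only the Python standard library. The ports (7687 for Neo4j, 6333 for Qdrant) come from the workflow and example files in this change; the `localhost` host, the `SERVICES` mapping, and the `port_open` helper are assumptions made for illustration and are not part of the project.

```python
import socket

# Ports taken from the workflow and example files in this change.
# "localhost" is an assumption about where the containers are published.
SERVICES = {"neo4j-bolt": 7687, "qdrant-http": 6333}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if port_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

An open port only shows that something is listening; it does not prove the service has finished starting, so a failed first write may still call for a short retry.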

### Write data (once)

Run this from the project root to write data to both Neo4j and Qdrant.

```bash
poetry run python tests/e2e/qdrant_e2e/populate_dbs.py
```

### Install Qdrant client

```bash
pip install qdrant-client
```

### Search

```bash
# search by vector
poetry run python -m examples.qdrant.vector_search

# search by text, with embeddings generated locally
poetry run python -m examples.qdrant.text_search
```
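
Both searches depend on each Qdrant point carrying a payload field (`neo4j_id` in these examples) whose value equals the `id` property of a Neo4j node. The sketch below illustrates that join with in-memory stand-ins. It is not the library's implementation: `Hit`, `join_hits_to_nodes`, and the toy data are hypothetical names introduced here, and only the two property names come from the examples.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class Hit:
    """Hypothetical stand-in for one vector-store search result."""

    score: float
    payload: Dict[str, Any]


def join_hits_to_nodes(
    hits: List[Hit],
    nodes: List[Dict[str, Any]],
    id_property_external: str,
    id_property_neo4j: str,
) -> List[Tuple[Dict[str, Any], float]]:
    """Pair each hit with the node whose id property equals the hit's payload id.

    Hits without a matching node are dropped; hit order is preserved.
    """
    by_id = {node[id_property_neo4j]: node for node in nodes}
    matched = []
    for hit in hits:
        node = by_id.get(hit.payload.get(id_property_external))
        if node is not None:
            matched.append((node, hit.score))
    return matched


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    hits = [Hit(0.9, {"neo4j_id": "a"}), Hit(0.5, {"neo4j_id": "unknown"})]
    nodes = [{"id": "a", "text": "example node"}]
    print(join_hits_to_nodes(hits, nodes, "neo4j_id", "id"))
```

If a search returns fewer results than `top_k`, a mismatch between the payload field and the node property is one thing worth checking.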
Empty file added examples/qdrant/__init__.py
Empty file.
27 changes: 27 additions & 0 deletions examples/qdrant/text_search.py
@@ -0,0 +1,27 @@
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
from neo4j import GraphDatabase
from neo4j_graphrag.retrievers import QdrantNeo4jRetriever
from qdrant_client import QdrantClient

NEO4J_URL = "neo4j://localhost:7687"
NEO4J_AUTH = ("neo4j", "password")


def main() -> None:
    with GraphDatabase.driver(NEO4J_URL, auth=NEO4J_AUTH) as neo4j_driver:
        embedder = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
        retriever = QdrantNeo4jRetriever(
            driver=neo4j_driver,
            client=QdrantClient(url="http://localhost:6333"),
            collection_name="Jeopardy",
            id_property_external="neo4j_id",
            id_property_neo4j="id",
            embedder=embedder,  # type: ignore
        )

        res = retriever.search(query_text="biology", top_k=2)
        print(res)


if __name__ == "__main__":
    main()
25 changes: 25 additions & 0 deletions examples/qdrant/vector_search.py
@@ -0,0 +1,25 @@
from neo4j import GraphDatabase
from neo4j_graphrag.retrievers import QdrantNeo4jRetriever
from qdrant_client import QdrantClient

from examples.embedding_biology import EMBEDDING_BIOLOGY

NEO4J_URL = "neo4j://localhost:7687"
NEO4J_AUTH = ("neo4j", "password")


def main() -> None:
    with GraphDatabase.driver(NEO4J_URL, auth=NEO4J_AUTH) as neo4j_driver:
        retriever = QdrantNeo4jRetriever(
            driver=neo4j_driver,
            client=QdrantClient(url="http://localhost:6333"),
            collection_name="Jeopardy",
            id_property_external="neo4j_id",
            id_property_neo4j="id",
        )
        res = retriever.search(query_vector=EMBEDDING_BIOLOGY, top_k=2)
        print(res)


if __name__ == "__main__":
    main()
144 changes: 110 additions & 34 deletions poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.