v0.4.0 - Aquiles-RAG

FredyRivera-dev released this 13 Sep 02:09

Release v0.4.0 🥳💥 - Official PostgreSQL support, send and query with metadata for Redis, Qdrant and PostgreSQL and integration of a Reranker API

This release adds official PostgreSQL support as a backend, optional metadata support when sending to and querying the RAG, and a Reranker API: a fast, model-based second-stage ranker (Fastembed runtime + server-side configuration).

Many of the commits and changes are described in detail in this issue: #3
Docs for v0.4.0: https://aquiles-ai.github.io/aqRAG-docs/

Highlights

  • PostgreSQL backend (first-class):

    • Full support for persisting vectors and metadata in PostgreSQL (e.g. via pgvector + JSONB for metadata).
    • Connection pool configuration supported (min/max pool size, max queries, timeout).
    • UI reflects PostgreSQL connection fields and indices/tables.
  • Metadata support for send and query:

    • metadata can be supplied when ingesting chunks (/rag/create) and as a filter when searching (/rag/query-rag).
    • Allowed metadata keys: author, language, topics, source, created_at, extra.
    • Backends persist and/or filter metadata according to their capabilities (Redis fields/JSON, Qdrant payloads, PostgreSQL JSONB).
  • Reranker API (/v1/rerank):

    • New endpoint for second-stage scoring of (query, doc_text) pairs.
    • Server-side configuration options exposed: rerank, provider_re, reranker_model, max_concurrent_request, reranker_preload.
    • Supports lazy load (background) and preload modes; responds 202 Accepted while loading asynchronously when configured.
  • Setup Wizard & UI improvements:

    • Interactive aquiles-rag configs wizard now supports PostgreSQL in addition to Redis and Qdrant.
    • Wizard collects reranker options and persists them in the standard config.
    • /ui/configs shows PostgreSQL-specific fields (pool & connection settings), and POST /ui/configs accepts them.
  • Client libraries updated:

    • AquilesRAG.query(...) and send_rag(...) now accept metadata and embedding_model parameters.
    • AquilesRAG.reranker(...) added — client helper that calls /v1/rerank.
    • Async client (AsyncAquilesRAG) mirrors the same new parameters and reranker method.
  • Docs & examples updated across installation.md, client.md, asynclient.md, and api.md to reflect new inputs, responses and behavior.
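
As an illustration of the allowed metadata keys listed above, here is a minimal client-side check. The helper name `check_metadata` is hypothetical and not part of Aquiles-RAG; the server may apply its own rules, so treat this as a sketch for catching typos early rather than a description of the server's validation.

```python
from typing import Any, Dict

# Allowed keys, copied from the release notes.
ALLOWED_METADATA_KEYS = {
    "author", "language", "topics", "source", "created_at", "extra",
}


def check_metadata(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Return metadata unchanged if every key is allowed, else raise ValueError.

    Hypothetical helper: it only checks key names, not value types.
    """
    unknown = sorted(set(metadata) - ALLOWED_METADATA_KEYS)
    if unknown:
        raise ValueError(f"Unsupported metadata keys: {unknown}")
    return metadata
```

A call such as `check_metadata({"author": "Alice", "language": "EN"})` passes the dictionary through, while a key like `"publisher"` raises before any request is sent.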

API changes (backwards-compatible)

  • POST /rag/create

    • New optional body field: metadata (object). Example:

      {
        "index": "my_idx",
        "name_chunk": "doc1_1",
        "dtype": "FLOAT32",
        "chunk_size": 1024,
        "raw_text": "Chunk text...",
        "embeddings": [0.12, 0.23, ...],
        "embedding_model": "text-embedding-v1",
        "metadata": { "author": "Alice", "language": "EN", "topics": ["RAG","LLM"] }
      }
  • POST /rag/query-rag

    • New optional body field: metadata — server will apply backend-appropriate filtering (Redis field/JSON queries, Qdrant payload filters, PostgreSQL JSONB/SQL filters).

    • Example:

      {
        "index": "my_idx",
        "embeddings": [0.1, 0.2, ...],
        "dtype": "FLOAT32",
        "top_k": 10,
        "cosine_distance_threshold": 0.65,
        "embedding_model": "text-embedding-v1",
        "metadata": { "author": "Alice", "language": "EN" }
      }
  • POST /v1/rerank

    • New endpoint: accepts rerankerjson: [[query, doc_text], ...] and returns a list of {query, doc, score} objects (or 202 while the model is loading).

    • Example:

      {
        "rerankerjson": [
          ["What is LLaDA?", "Candidate document text..."],
          ["Explain diffusion", "Another doc text..."]
        ]
      }
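
The request bodies above can also be assembled programmatically. The sketch below mirrors the field names from the JSON examples; the function names `build_create_body` and `build_rerank_body` are hypothetical, and treating `embedding_model` as optional is an assumption. Authentication and the HTTP call itself are deliberately left out.

```python
from typing import Any, Dict, List, Optional, Sequence


def build_create_body(
    index: str,
    name_chunk: str,
    raw_text: str,
    embeddings: Sequence[float],
    embedding_model: Optional[str] = None,
    metadata: Optional[Dict[str, Any]] = None,
    dtype: str = "FLOAT32",
    chunk_size: int = 1024,
) -> Dict[str, Any]:
    """Build a POST /rag/create body using the fields shown in the example."""
    body: Dict[str, Any] = {
        "index": index,
        "name_chunk": name_chunk,
        "dtype": dtype,
        "chunk_size": chunk_size,
        "raw_text": raw_text,
        "embeddings": list(embeddings),
    }
    # Optional fields are omitted when not supplied, so calls without
    # metadata produce the same body shape as before this release.
    if embedding_model is not None:
        body["embedding_model"] = embedding_model
    if metadata is not None:
        body["metadata"] = metadata
    return body


def build_rerank_body(query: str, docs: Sequence[str]) -> Dict[str, List[List[str]]]:
    """Pair one query with each candidate document for POST /v1/rerank."""
    return {"rerankerjson": [[query, doc] for doc in docs]}
```

Either dictionary can be serialized with `json.dumps` and sent with whatever HTTP client you already use.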

Client usage (quick examples)

  • Send with metadata:

    client.send_rag(
        embedding_func=my_emb_fn,
        index="my_idx",
        name_chunk="doc",
        raw_text="Very long text ...",
        embedding_model="text-embedding-v1",
        metadata={"author": "Alice", "language": "EN", "topics": ["RAG"]}
    )
  • Query with metadata:

    client.query(
        index="my_idx",
        embedding=q_emb,
        top_k=5,
        embedding_model="text-embedding-v1",
        metadata={"language": "EN"}
    )
  • Rerank:

    resp = client.reranker("Tell me about LLaDA", query_results_or_list_of_docs)

Async client mirrors the same API surface (AsyncAquilesRAG.send_rag, query, reranker).

Migration & Upgrade notes

  1. Config file: re-run the interactive wizard (aquiles-rag configs) to add PostgreSQL or reranker settings to your ~/.local/share/aquiles/aquiles_config.json, or edit the config with the UI.
  2. PostgreSQL prerequisites: if you plan to use PostgreSQL for vectors, ensure your DB has appropriate extensions (e.g. pgvector) and that your schema supports a vector column + JSONB for metadata. Tune pool settings (min_size, max_size, max_queries, timeout) in the wizard or UI.
  3. Reranker behavior: the reranker can be configured to preload into memory (reranker_preload=true) or to load asynchronously on demand. If you set reranker_preload=false, the first call may return 202 Accepted while the model loads in the background.
  4. Clients: upgrade your package and update any direct usages to include the new optional metadata parameters where needed. Old calls without metadata remain valid.
  5. Version pin: pip install --upgrade aquiles-rag==0.4.0
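
The 202 Accepted behavior described in note 3 can be handled on the client side with a small retry loop. This is a sketch under stated assumptions: `post` is a hypothetical callable that sends the body to /v1/rerank and returns `(status_code, parsed_json)`, and a successful rerank is assumed to answer with status 200. Neither detail is confirmed by these notes.

```python
import time
from typing import Any, Callable, Dict, Tuple


def rerank_with_retry(
    post: Callable[[Dict[str, Any]], Tuple[int, Any]],
    body: Dict[str, Any],
    attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call `post` until it stops answering 202, up to `attempts` times.

    `post` is a stand-in for your own HTTP call (assumption); `sleep` is
    injectable so the loop can be exercised without real waiting.
    """
    for attempt in range(attempts):
        status, payload = post(body)
        if status == 200:  # assumed success status
            return payload
        if status != 202:
            raise RuntimeError(f"Unexpected status {status}")
        # 202: the model is still loading; wait before the next attempt.
        if attempt < attempts - 1:
            sleep(delay_s)
    raise TimeoutError("Reranker still loading after all retries")
```

Setting reranker_preload=true avoids the wait on the first call, at the cost of loading the model at startup.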

Changelog (summary)

  • Added PostgreSQL backend wiring and UI/CLI support.
  • Added metadata field to /rag/create and /rag/query-rag and client methods.
  • Added reranker API and client helpers; added reranker configuration options.
  • Setup Wizard extended to include PostgreSQL and reranker options.
  • Updated client and async client to accept metadata and reranker calls.
  • Updated docs: installation.md, client.md, asynclient.md, api.md, UI configs.

Thanks & Credits

Thanks to everyone for your patience during the long wait for these new features. We look forward to continuing to improve Aquiles-RAG and make it the best RAG runtime. Let's keep iterating!

If anything breaks or you need an immediate patch, open an issue (or reference #3) and we’ll triage it quickly.