Releases: Aquiles-ai/Aquiles-RAG

v0.4.0 - Aquiles-RAG

13 Sep 02:09

Release v0.4.0 🥳💥 - Official PostgreSQL support, send and query with metadata for Redis, Qdrant and PostgreSQL and integration of a Reranker API

This release adds official PostgreSQL support as a backend, optional metadata on ingest and query, and a Reranker API: a fast, model-based second-stage ranker (Fastembed runtime + server-side configuration).

Many of the commits and changes are described in detail in this issue: #3
Docs for v0.4.0: https://aquiles-ai.github.io/aqRAG-docs/

Highlights

  • PostgreSQL backend (first-class):

    • Full support to persist vectors and metadata in PostgreSQL (e.g. via pgvector + JSONB for metadata).
    • Connection pool configuration supported (min/max pool size, max queries, timeout).
    • UI reflects PostgreSQL connection fields and indices/tables.
  • Metadata support for send and query:

    • metadata can be supplied when ingesting chunks (/rag/create) and as a filter when searching (/rag/query-rag).
    • Allowed metadata keys: author, language, topics, source, created_at, extra.
    • Backends persist and/or filter metadata according to their capabilities (Redis fields/JSON, Qdrant payloads, PostgreSQL JSONB).
  • Reranker API (/v1/rerank):

    • New endpoint for second-stage scoring of (query, doc_text) pairs.
    • Server-side configuration options exposed: rerank, provider_re, reranker_model, max_concurrent_request, reranker_preload.
    • Supports lazy load (background) and preload modes; responds 202 Accepted while loading asynchronously when configured.
  • Setup Wizard & UI improvements:

    • Interactive aquiles-rag configs wizard now supports PostgreSQL in addition to Redis and Qdrant.
    • Wizard collects reranker options and persists them in the standard config.
    • The /ui/configs page and its POST handler show and accept PostgreSQL-specific fields (pool & connection settings).
  • Client libraries updated:

    • AquilesRAG.query(...) and send_rag(...) now accept metadata and embedding_model parameters.
    • AquilesRAG.reranker(...) added — client helper that calls /v1/rerank.
    • Async client (AsyncAquilesRAG) mirrors the same new parameters and reranker method.
  • Docs & examples updated across installation.md, client.md, asynclient.md, and api.md to reflect new inputs, responses and behavior.
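
The allowed metadata keys listed above can be checked on the caller side before sending a chunk. This is a minimal sketch, not the server's own validation: `check_metadata` is a hypothetical helper name, and the only facts taken from these notes are the six key names.

```python
from typing import Any, Dict

# Allowed keys as listed in these release notes.
ALLOWED_METADATA_KEYS = {"author", "language", "topics", "source", "created_at", "extra"}


def check_metadata(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical client-side check: reject keys outside the allowed set."""
    unknown = sorted(set(metadata) - ALLOWED_METADATA_KEYS)
    if unknown:
        raise ValueError(f"Unsupported metadata keys: {unknown}")
    return metadata
```

Running the check before calling send_rag(...) or query(...) surfaces typos such as "lang" instead of "language" early, regardless of how each backend treats unknown keys.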

API changes (backwards-compatible)

  • POST /rag/create

    • New optional body field: metadata (object). Example:

      {
        "index": "my_idx",
        "name_chunk": "doc1_1",
        "dtype": "FLOAT32",
        "chunk_size": 1024,
        "raw_text": "Chunk text...",
        "embeddings": [0.12, 0.23, ...],
        "embedding_model": "text-embedding-v1",
        "metadata": { "author": "Alice", "language": "EN", "topics": ["RAG","LLM"] }
      }
  • POST /rag/query-rag

    • New optional body field: metadata — server will apply backend-appropriate filtering (Redis field/JSON queries, Qdrant payload filters, PostgreSQL JSONB/SQL filters).

    • Example:

      {
        "index": "my_idx",
        "embeddings": [0.1, 0.2, ...],
        "dtype": "FLOAT32",
        "top_k": 10,
        "cosine_distance_threshold": 0.65,
        "embedding_model": "text-embedding-v1",
        "metadata": { "author": "Alice", "language": "EN" }
      }
  • POST /v1/rerank

    • New endpoint: accepts rerankerjson: [[query, doc_text], ...] and returns a list of {query, doc, score} objects (or 202 while loading).

    • Example:

      {
        "rerankerjson": [
          ["What is LLaDA?", "Candidate document text..."],
          ["Explain diffusion", "Another doc text..."]
        ]
      }
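
The /v1/rerank body above pairs one query with each candidate text. A minimal sketch of building that body and ordering the returned {query, doc, score} objects follows. Both function names are hypothetical, and sorting from highest to lowest score assumes that a higher score means a more relevant document, which these notes do not state.

```python
from typing import Any, Dict, List, Sequence


def build_rerank_body(query: str, docs: Sequence[str]) -> Dict[str, List[List[str]]]:
    """Build the request body: one [query, doc_text] pair per candidate."""
    return {"rerankerjson": [[query, doc] for doc in docs]}


def sort_by_score(results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Order rerank results best-first (assumption: higher score = more relevant)."""
    return sorted(results, key=lambda item: item["score"], reverse=True)
```

The body can be posted with any HTTP client; the AquilesRAG.reranker(...) helper described in these notes wraps the same endpoint.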

Client usage (quick examples)

  • Send with metadata:
client.send_rag(
  embedding_func=my_emb_fn,
  index="my_idx",
  name_chunk="doc",
  raw_text="Very long text ...",
  embedding_model="text-embedding-v1",
  metadata={"author":"Alice", "language":"EN", "topics": ["RAG"]}
)
  • Query with metadata:
client.query(
  index="my_idx",
  embedding=q_emb,
  top_k=5,
  embedding_model="text-embedding-v1",
  metadata={"language":"EN"}
)
  • Rerank:
resp = client.reranker("Tell me about LLaDA", query_results_or_list_of_docs)

Async client mirrors the same API surface (AsyncAquilesRAG.send_rag, query, reranker).

Migration & Upgrade notes

  1. Config file: re-run the interactive wizard (aquiles-rag configs) to add PostgreSQL or reranker settings to your ~/.local/share/aquiles/aquiles_config.json, or edit the config with the UI.
  2. PostgreSQL prerequisites: if you plan to use PostgreSQL for vectors, ensure your DB has appropriate extensions (e.g. pgvector) and that your schema supports a vector column + JSONB for metadata. Tune pool settings (min_size, max_size, max_queries, timeout) in the wizard or UI.
  3. Reranker behavior: the reranker can be preloaded into memory (reranker_preload=true) or loaded asynchronously on demand. If you set reranker_preload=false, the first call may return 202 Accepted while the model loads in the background.
  4. Clients: upgrade your package and update any direct usages to include the new optional metadata parameters where needed. Old calls without metadata remain valid.
  5. Version pin: pip install --upgrade aquiles-rag==0.4.0
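
For item 3, callers that use reranker_preload=false may want to retry while the server answers 202 Accepted. A minimal sketch, assuming a hypothetical `send` callable that performs the rerank request and returns a `(status_code, body)` tuple; the attempt count and delay are illustrative values, not documented defaults.

```python
import time
from typing import Any, Callable, Tuple


def call_until_loaded(
    send: Callable[[], Tuple[int, Any]],
    attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call `send` until it stops returning 202, waiting between attempts."""
    for attempt in range(attempts):
        status, body = send()
        if status != 202:
            return status, body
        # Model still loading: wait before the next attempt (not after the last one).
        if attempt < attempts - 1:
            sleep(delay_s)
    raise TimeoutError(f"Reranker still loading after {attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.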

Changelog (summary)

  • Added PostgreSQL backend wiring and UI/CLI support.
  • Added metadata field to /rag/create and /rag/query-rag and client methods.
  • Added reranker API and client helpers; added reranker configuration options.
  • Setup Wizard extended to include PostgreSQL and reranker options.
  • Updated client and async client to accept metadata and reranker calls.
  • Updated docs: installation.md, client.md, asynclient.md, api.md, UI configs.

Thanks & Credits

Thanks to everyone for your patience while waiting for these features. We look forward to continuing to improve Aquiles-RAG and make it the best RAG runtime. Let's keep iterating!

If anything breaks or you need an immediate patch, open an issue (or reference #3) and we’ll triage it quickly.

v0.3.0 - Aquiles-RAG

19 Aug 00:12

Release v0.3.0 🥳💥

In this release we add official Qdrant support to Aquiles-RAG while maintaining backwards compatibility with Redis. We also introduce an improved setup experience (interactive Setup Wizard), a unified connection layer, and an enhanced deploy pattern.

Many of the commits and changes are described in detail in this issue: #2
Docs for v0.3.0: https://aquiles-ai.github.io/aqRAG-docs/

Highlights

  • 📡 Unified connection layer (Redis & Qdrant)

    • New async helper get_connectionAll() centralizes connection creation for both Redis and Qdrant.
    • A Redis wrapper RdsWr and a Qdrant wrapper QdrantWr encapsulate backend-specific logic (create index, query, send embeddings, drop index). These wrappers let the rest of the codebase call a consistent API regardless of backend.
    • Note: backend-specific differences still exist (e.g., dtype handling and metrics); see API Reference for details.
  • 🧰 Interactive Setup Wizard (CLI)

    • The previous per-flag configuration flow was deprecated in favor of an interactive wizard. Run:
      aquiles-rag configs
    • The wizard prompts for Redis or Qdrant details (host, ports, TLS/gRPC options, API keys, admin user) and writes the final config to:
      ~/.local/share/aquiles/aquiles_config.json
      
    • Automation / CI: to automate setup in scripts/CI, generate the same JSON schema the wizard writes and place it in the path above, or call gen_configs_file() from a small bootstrap script.
  • 👨‍🚀 Deploy pattern (Redis & Qdrant)

    • Introduced DeployConfigRd and DeployConfigQdrant for deployment-time config generation (the old DeployConfig was replaced by these more explicit classes). Use them with gen_configs_file() inside a deploy_*.py script.
    • Example run command:
      # Redis
      aquiles-rag deploy --host "0.0.0.0" --port 5500 --workers 4 deploy_redis.py
      
      # Qdrant
      aquiles-rag deploy --host "0.0.0.0" --port 5500 --workers 4 deploy_qdrant.py
    • The deploy command imports and executes run() from the supplied script, writes aquiles_config.json, and then starts the FastAPI app.
  • 💻 UI Playground: backend-aware live config

    • The UI Playground can edit and display configuration fields for the currently active backend (if Aquiles-RAG is launched with Qdrant you can tune Qdrant settings; if launched with Redis you can tune Redis settings).
    • Important: some fields apply in place while others require a restart; see the UI docs for the exact list of editable options.
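
For the automation path mentioned under the Setup Wizard (placing a JSON file at the config path), a minimal sketch follows. `write_config` is a hypothetical helper, not part of aquiles-rag, and the dictionary you pass must match the schema the wizard writes, which is not reproduced here. The `force` flag only mirrors the idea of gen_configs_file(..., force=True); it is not that function.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Config location as stated in these release notes.
CONFIG_PATH = Path.home() / ".local" / "share" / "aquiles" / "aquiles_config.json"


def write_config(config: Dict[str, Any], path: Path = CONFIG_PATH, force: bool = False) -> bool:
    """Write `config` as JSON; keep an existing file unless force=True."""
    if path.exists() and not force:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a sibling temp file first so a crash cannot leave a half-written config.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(config, indent=2), encoding="utf-8")
    tmp.replace(path)
    return True
```

Where the package is installed, calling gen_configs_file() from a bootstrap script, as described above, is the documented route.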

API & Behavior notes

  • Create index: the POST /create/index endpoint now supports creating an index/collection in Redis or Qdrant. Responses differ slightly by backend (Redis returns schema + fields; Qdrant returns success + index name).
  • Ingestion (/rag/create) and Query (/rag/query-rag) now work with both backends; server validates dtype for Redis and forwards float arrays to Qdrant.
  • Monitoring: /status/ram returns Redis memory stats when Redis is used. For Qdrant, Redis-specific memory metrics are not available; the endpoint returns a short explanatory message in the redis field.

Developer notes & migration tips

  • Deprecated: manual CLI flags for config. Use the wizard or the deploy pattern for new installs.
  • CI / automation: call gen_configs_file(dp_cfg, force=True) from your bootstrap script to create the runtime config non-interactively.
  • Testing after upgrade:
    1. Run aquiles-rag serve --host "0.0.0.0" --port 5500.
    2. Check readiness: curl http://localhost:5500/health/ready.
    3. Create a quick index and sanity-test ingestion/query (see API Reference).

Breaking / notable changes (callouts)

  • The per-flag CLI configuration flow has been deprecated in favor of the Setup Wizard. If you relied on the old flags in scripts, update to the deploy pattern or write the JSON config file directly.
  • DeployConfig was replaced with DeployConfigRd and DeployConfigQdrant — update any deployment scripts accordingly.

Thanks & where to get help

Thanks to everyone who contributed to the Qdrant integration and the new setup/deploy flow. For details and migration examples see the updated documentation: https://aquiles-ai.github.io/aqRAG-docs/ and the issue tracker (#2).

v0.2.9 - Aquiles-RAG

14 Aug 16:32

🚀 Release v0.2.9

  • 📝 Changes to the gen_configs_file() function for Aquiles-RAG deployment

    • gen_configs_file() has a new boolean parameter, force. When True, it forces the configuration to be rewritten on deploy. The default is False.
  • New metadata fields have been added to improve indexing and searching

    • The /rag/create and /rag/query-rag endpoints accept a new optional parameter, embedding_model, which records the model that generated the embeddings. This improves search and adds a layer of security when retrieving information.
  • ⚖️ License change from MIT to Apache 2.0

    • We wanted a more robust license for everyone that remains as permissive as MIT.
  • ⛏️Configuration loading has been made asynchronous and embedding_model support has been completed on clients

    • load_aquiles_config() has been rewritten as an asynchronous function using aiofiles. This reduces blocking time during startup/queries (internal testing shows performance improvements of up to 3× on the startup/config path).

    • The AsyncAquilesRAG and AquilesRAG clients now support the embedding_model parameter in query() and send_rag(), and their docstrings have been updated.

v0.2.8 - Aquiles - RAG

09 Aug 23:38

🚀 Release v0.2.8

  • 🛠️ Timeouts have been added to the AsyncAquilesRAG client

    • 🚀 While building the aquiles-chat-demo demo, we found timeout errors in the asynchronous client. AsyncAquilesRAG now uses more robust timeouts, which improves stability.
  • Added a 'drop_index' method to both the synchronous and asynchronous Aquiles clients; it calls the '/rag/drop_index' API.

v0.2.7 - Aquiles - RAG

06 Aug 04:37

🚀 Release v0.2.7

  • 🛠️ Auth fixes in '/docs' and '/redoc' paths

    • 🚀 Due to a configuration error, the '/docs' and '/redoc' paths were exposed without authentication. This has been corrected and both paths are now protected.
  • Version and function validator in the 'AsyncAquilesRAG' client

    • A version validator now warns you when aquiles-rag is out of date, and the 'AsyncAquilesRAG' client now accepts both asynchronous and synchronous embedding functions.
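
Accepting both kinds of embedding function can be done by calling the function and awaiting the result only when it is awaitable. This is a minimal sketch of that general technique, not the client's actual code; `call_embedding_func` is a hypothetical name.

```python
import asyncio
import inspect
from typing import Any, Callable, List


async def call_embedding_func(func: Callable[[str], Any], text: str) -> List[float]:
    """Call a sync or async embedding function and return its result."""
    result = func(text)
    # Async functions return a coroutine here; sync functions return the value directly.
    if inspect.isawaitable(result):
        result = await result
    return result
```

A synchronous function still runs inside the event loop in this sketch, so a slow one would block other tasks; offloading it with asyncio.to_thread is one option.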

v0.2.6 - Aquiles - RAG

01 Aug 00:32

🚀 Release v0.2.6

  • 🛠️ Fixes in the deploy command

    • 🚀 Fixes were made to the deploy command to make it more stable and fully compatible with workers so it doesn't kill processes during deployment.
  • 🗑️ Drop index endpoint

    • ➕ Added /rag/drop_index:

      • index_name: str — Name of the Redis index to delete.
      • delete_docs: bool — If true, also removes all documents in the index; if false, only drops the index definition.

      Using these inputs, it calls Redis’s dropindex (with or without documents) and returns the operation result along with the index name.

v0.2.5 - Aquiles - RAG

27 Jul 03:48

🚀 Release v0.2.5

  • 🛠️ Fix Redis connection handling

    • 🐛 Make it more resilient to failures during intensive use or load testing.
  • New asynchronous client

    • 🚀 Introduced AsyncAquilesRAG, a fully async Aquiles‑RAG client for non‑blocking requests.
  • 🧪 New tests

    • 🔁 In testcg.py, added a load test that multiplies requests by 10 every 2 s to validate endpoint stability.
  • 📊 Status endpoints

    • ➕ Added /status/ram to report Redis memory usage and Aquiles‑RAG CPU/RAM stats.
    • 🖥️ Added /status with an HTML template to visualize those metrics.
  • ⚙️ Update aquiles-rag deploy

    • ➕ Introduced a workers parameter to configure the number of workers for aquiles-rag.

First stable and deployable version

25 Jul 01:45

Aquiles-RAG 🏷️ v0.2.0

🚀 Highlights

  • Secure API

    • Added API‑key based protection on all RAG endpoints (/create/index, /rag/create, /rag/query-rag).
    • Configurable list of allowed API keys in aquiles_config.json.
  • OAuth‐Protected UI & Docs

    • Username/password login (/token) to access the mini‑UI, Swagger UI (/docs) and ReDoc (/redoc).
    • All UI routes now require a valid access_token cookie.
  • Unified Deployment Workflow

    • DeployConfig + gen_configs_file() to bootstrap aquiles_config.json from a single class.
    • aquiles-rag deploy CLI command dynamically loads your run() from any Python file and then starts Uvicorn.
  • Python SDK Improvements

    • AquilesRAG client now supports sending API‑Key headers automatically.
    • send_rag() chunking & upload matches the new RAG endpoints out‑of‑the‑box.
  • Online Docs Published

📝 Changelog

New Features

  • API Security

    • verify_api_key dependency on /create/index, /rag/create, /rag/query-rag.
    • allows_api_keys and allows_users configurable via aquiles_config.json or CLI.
  • UI & Swagger Protection

    • /login/ui and /token routes to issue JWT in an HTTP‑only cookie.
    • Protected /ui, /docs, /redoc, and OpenAPI JSON behind OAuth.
  • Deployment Config

    • aquiles.deploy_config.DeployConfig extends InitConfigs with JWT_SECRET and ALGORITHM.
    • aquiles-rag deploy --host HOST --port PORT CONFIG_FILE.py
    • Example test_deploy.py for Render: generates the configuration and starts the server in one command.
  • Client SDK

    • AquilesRAG constructor accepts api_key; all requests now include X-API-Key.
    • Compatibility fixes for create_index, send_rag, and query methods.
  • Documentation

Bug Fixes & Enhancements

  • Fixed race conditions in index‑drop logic when delete_the_index_if_it_exists=true.
  • Improved error handling and status codes on Redis failures (400 vs 500).
  • UI playground now displays live configs, create/index form, RAG ingest form, and query form.
  • Updated chunk_text_by_words() default to ~600 words per chunk for optimal performance.
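
To illustrate what word-based chunking at roughly 600 words per chunk looks like, here is a minimal sketch. It is not the library's chunk_text_by_words() implementation; `chunk_by_words` is a hypothetical stand-in that splits on whitespace only.

```python
from typing import List


def chunk_by_words(text: str, words_per_chunk: int = 600) -> List[str]:
    """Split text into consecutive chunks of at most `words_per_chunk` words."""
    if words_per_chunk <= 0:
        raise ValueError("words_per_chunk must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + words_per_chunk])
        for start in range(0, len(words), words_per_chunk)
    ]
```

Each chunk would then be embedded and sent as its own /rag/create request, which is what send_rag() is described as handling for you.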

📦 Installation / Upgrade

pip install --upgrade aquiles-rag==0.2.0

Then, if you’re deploying from source:

git checkout v0.2.0
pip install -r requirements.txt

Many of the changes were proposed in this issue: #1

First major update

20 Jul 04:24

What does this update include?

  • Aquiles-RAG has been converted to a minimally functional and installable package.

  • The CLI is now working. For now, we only recommend using the following commands (configuration commands are still under development):

Hello:

fredy@fredy-ProLiant-MicroServer-Gen10:~/projects/Aquiles-RAG$ aquiles-rag hello --name Fredy
Hello, Fredy!

Start the server:

fredy@fredy-ProLiant-MicroServer-Gen10:~/projects/Aquiles-RAG$ aquiles-rag serve
INFO:     Started server process [139767]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:5500 (Press CTRL+C to quit)
INFO:     192.168.1.6:51308 - "GET /ui HTTP/1.1" 200 OK
INFO:     192.168.1.6:51309 - "GET /docs HTTP/1.1" 200 OK
INFO:     192.168.1.6:51309 - "GET /openapi.json HTTP/1.1" 200 OK
  • A mini-UI has been added, which will allow future configuration options from a simple, intuitive, and clean web interface, possibly allowing the option to test endpoints from this mini-UI.

What's missing?

  • Finish developing the functions to configure connections to Redis.

  • Manage search and write to Redis from Aquiles-RAG

  • Design an efficient Redis save and search scheme for most RAG use cases.

  • Complete the mini-UI.

  • Handle runtime errors.

  • Possible expansion to other high-performance databases.

Perhaps further expansions to Aquiles-RAG will be considered later, but for now, we have this as a base.