Releases: Aquiles-ai/Aquiles-RAG
v0.4.0 - Aquiles-RAG
Release v0.4.0 🥳💥 - Official PostgreSQL support, send and query with metadata for Redis, Qdrant and PostgreSQL and integration of a Reranker API
This release adds official PostgreSQL support as a backend, optional metadata when sending data to and querying the RAG, and a Reranker API: a fast, model-based second-stage ranker (Fastembed runtime + server-side configuration).
Many of the commits and changes are described in detail in this issue: #3
Docs for v0.4.0: https://aquiles-ai.github.io/aqRAG-docs/
Highlights
- ✅ PostgreSQL backend (first-class):
  - Full support to persist vectors and metadata in PostgreSQL (e.g. via `pgvector` + `JSONB` for metadata).
  - Connection pool configuration supported (min/max pool size, max queries, timeout).
  - UI reflects PostgreSQL connection fields and indices/tables.
- ✅ Metadata support for `send` and `query`:
  - `metadata` can be supplied when ingesting chunks (`/rag/create`) and as a filter when searching (`/rag/query-rag`).
  - Allowed metadata keys: `author`, `language`, `topics`, `source`, `created_at`, `extra`.
  - Backends persist and/or filter metadata according to their capabilities (Redis fields/JSON, Qdrant payloads, PostgreSQL JSONB).
- ✅ Reranker API (`/v1/rerank`):
  - New endpoint for second-stage scoring of `(query, doc_text)` pairs.
  - Server-side configuration options exposed: `rerank`, `provider_re`, `reranker_model`, `max_concurrent_request`, `reranker_preload`.
  - Supports lazy load (background) and preload modes; responds `202 Accepted` while loading asynchronously when configured.
- ✅ Setup Wizard & UI improvements:
  - Interactive `aquiles-rag configs` wizard now supports PostgreSQL in addition to Redis and Qdrant.
  - Wizard collects reranker options and persists them in the standard config.
  - `/ui/configs` and `/ui/configs` POST accept and show PostgreSQL-specific fields (pool & connection settings).
- ✅ Client libraries updated:
  - `AquilesRAG.query(...)` and `send_rag(...)` now accept `metadata` and `embedding_model` parameters.
  - `AquilesRAG.reranker(...)` added: a client helper that calls `/v1/rerank`.
  - Async client (`AsyncAquilesRAG`) mirrors the same new parameters and `reranker` method.
- ✅ Docs & examples updated across `installation.md`, `client.md`, `asynclient.md`, and `api.md` to reflect new inputs, responses and behavior.
API changes (backwards-compatible)
- `POST /rag/create`
  - New optional body field: `metadata` (object).
  - Example: `{ "index": "my_idx", "name_chunk": "doc1_1", "dtype": "FLOAT32", "chunk_size": 1024, "raw_text": "Chunk text...", "embeddings": [0.12, 0.23, ...], "embedding_model": "text-embedding-v1", "metadata": { "author": "Alice", "language": "EN", "topics": ["RAG","LLM"] } }`
POST /rag/query-rag
-
New optional body field:
metadata
— server will apply backend-appropriate filtering (Redis field/JSON queries, Qdrant payload filters, PostgreSQL JSONB/SQL filters). -
Example:
{ "index": "my_idx", "embeddings": [0.1, 0.2, ...], "dtype": "FLOAT32", "top_k": 10, "cosine_distance_threshold": 0.65, "embedding_model": "text-embedding-v1", "metadata": { "author": "Alice", "language": "EN" } }
-
-
POST /v1/rerank
-
New endpoint: accepts
rerankerjson: [[query, doc_text], ...]
and returns list of{query, doc, score}
objects (or202
while loading). -
Example:
{ "rerankerjson": [ ["What is LLaDA?", "Candidate document text..."], ["Explain diffusion", "Another doc text..."] ] }
-
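The request bodies above can be assembled with a couple of small helpers. This is a minimal sketch, not part of the Aquiles-RAG client: `clean_metadata`, `build_create_body` and `build_query_body` are hypothetical names, and the client-side rejection of unknown metadata keys is an assumption (the server may treat unknown keys differently). Field names and the allowed key list are taken from the examples and highlights in these notes.

```python
from typing import Any, Dict, List, Optional

# Metadata keys listed as allowed in the v0.4.0 highlights.
ALLOWED_METADATA_KEYS = {"author", "language", "topics", "source", "created_at", "extra"}


def clean_metadata(metadata: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical client-side check: reject keys outside the allowed set."""
    if not metadata:
        return {}
    unknown = set(metadata) - ALLOWED_METADATA_KEYS
    if unknown:
        raise ValueError(f"unsupported metadata keys: {sorted(unknown)}")
    return dict(metadata)


def build_create_body(index: str, name_chunk: str, raw_text: str,
                      embeddings: List[float],
                      embedding_model: Optional[str] = None,
                      metadata: Optional[Dict[str, Any]] = None,
                      dtype: str = "FLOAT32",
                      chunk_size: int = 1024) -> Dict[str, Any]:
    """Build a JSON body shaped like the POST /rag/create example."""
    body: Dict[str, Any] = {
        "index": index,
        "name_chunk": name_chunk,
        "dtype": dtype,
        "chunk_size": chunk_size,
        "raw_text": raw_text,
        "embeddings": embeddings,
    }
    if embedding_model:
        body["embedding_model"] = embedding_model
    meta = clean_metadata(metadata)
    if meta:  # metadata is optional, so omit the field when empty
        body["metadata"] = meta
    return body


def build_query_body(index: str, embeddings: List[float], top_k: int = 10,
                     cosine_distance_threshold: float = 0.65,
                     embedding_model: Optional[str] = None,
                     metadata: Optional[Dict[str, Any]] = None,
                     dtype: str = "FLOAT32") -> Dict[str, Any]:
    """Build a JSON body shaped like the POST /rag/query-rag example."""
    body: Dict[str, Any] = {
        "index": index,
        "embeddings": embeddings,
        "dtype": dtype,
        "top_k": top_k,
        "cosine_distance_threshold": cosine_distance_threshold,
    }
    if embedding_model:
        body["embedding_model"] = embedding_model
    meta = clean_metadata(metadata)
    if meta:
        body["metadata"] = meta
    return body
```

Because `metadata` is optional, a body built without it has the same shape as a pre-0.4.0 request, which matches the backwards-compatibility note in the section heading.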
Client usage (quick examples)
- Send with metadata:
client.send_rag(
embedding_func=my_emb_fn,
index="my_idx",
name_chunk="doc",
raw_text="Very long text ...",
embedding_model="text-embedding-v1",
metadata={"author":"Alice", "language":"EN", "topics": list({"RAG"})}
)
- Query with metadata:
client.query(
index="my_idx",
embedding=q_emb,
top_k=5,
embedding_model="text-embedding-v1",
metadata={"language":"EN"}
)
- Rerank:
resp = client.reranker("Tell me about LLaDA", query_results_or_list_of_docs)
Async client mirrors the same API surface (`AsyncAquilesRAG.send_rag`, `query`, `reranker`).
Migration & Upgrade notes
- Config file: re-run the interactive wizard (`aquiles-rag configs`) to add PostgreSQL or reranker settings to your `~/.local/share/aquiles/aquiles_config.json`, or edit the config with the UI.
- PostgreSQL prerequisites: if you plan to use PostgreSQL for vectors, ensure your DB has appropriate extensions (e.g. `pgvector`) and that your schema supports a vector column + `JSONB` for metadata. Tune pool settings (`min_size`, `max_size`, `max_queries`, `timeout`) in the wizard or UI.
- Reranker behavior: the reranker may be configured to preload into memory (`reranker_preload=true`) or load async on demand. If you set `reranker_preload=false`, the first call may return `202 Accepted` while the model loads in the background.
- Clients: upgrade your package and update any direct usages to include the new optional `metadata` parameters where needed. Old calls without metadata remain valid.
- Version pin: `pip install --upgrade aquiles-rag==0.4.0`
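Since a lazily loaded reranker may answer `202 Accepted` on the first call, callers may want to retry until the model is ready. The following is a rough sketch under stated assumptions: `rerank_with_retry` is a hypothetical helper, the HTTP call is injected as a `post` callable returning `(status_code, parsed_json)` so the logic does not depend on a particular HTTP library, and the retry count and delay are arbitrary defaults.

```python
import time
from typing import Any, Callable, Sequence, Tuple

# (status_code, parsed_json_body) as returned by the injected HTTP function.
Response = Tuple[int, Any]


def rerank_with_retry(post: Callable[[dict], Response],
                      pairs: Sequence[Sequence[str]],
                      attempts: int = 5,
                      delay: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep) -> Any:
    """POST (query, doc_text) pairs to /v1/rerank, retrying while the server
    replies 202 (model still loading)."""
    body = {"rerankerjson": [list(pair) for pair in pairs]}
    for attempt in range(attempts):
        status, payload = post(body)
        if status == 200:
            return payload  # list of {query, doc, score} objects
        if status != 202:
            raise RuntimeError(f"rerank failed with HTTP {status}")
        if attempt < attempts - 1:
            sleep(delay)  # give the background load time to finish
    raise TimeoutError("reranker model still loading after all retries")
```

In real use, `post` would wrap whatever HTTP client you already have and send the body to `/v1/rerank` with your API key header. Setting `reranker_preload=true` on the server avoids the retry path entirely.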
Changelog (summary)
- Added PostgreSQL backend wiring and UI/CLI support.
- Added `metadata` field to `/rag/create` and `/rag/query-rag` and client methods.
- Added reranker API and client helpers; added reranker configuration options.
- Setup Wizard extended to include PostgreSQL and reranker options.
- Updated client and async client to accept metadata and reranker calls.
- Updated docs: `installation.md`, `client.md`, `asynclient.md`, `api.md`, UI configs.
Thanks & Credits
Thanks to everyone for your patience during the long wait for these new features. We look forward to continuing to improve Aquiles-RAG to make it the best RAG runtime. Let's keep iterating!
If anything breaks or you need an immediate patch, open an issue (or reference #3) and we’ll triage it quickly.
v0.3.0 - Aquiles-RAG
Release v0.3.0 🥳💥
In this release we add official Qdrant support to Aquiles-RAG while maintaining backwards compatibility with Redis. We also introduce an improved setup experience (interactive Setup Wizard), a unified connection layer, and an enhanced deploy pattern.
Many of the commits and changes are described in detail in this issue: #2
Docs for v0.3.0: https://aquiles-ai.github.io/aqRAG-docs/
Highlights
- 📡 Unified connection layer (Redis & Qdrant)
  - New async helper `get_connectionAll()` centralizes connection creation for both Redis and Qdrant.
  - A Redis wrapper `RdsWr` and a Qdrant wrapper `QdrantWr` encapsulate backend-specific logic (create index, query, send embeddings, drop index). These wrappers let the rest of the codebase call a consistent API regardless of backend.
  - Note: backend-specific differences still exist (e.g., dtype handling and metrics); see API Reference for details.
- 🧰 Interactive Setup Wizard (CLI)
  - The previous per-flag configuration flow was deprecated in favor of an interactive wizard. Run: `aquiles-rag configs`
  - The wizard prompts for Redis or Qdrant details (host, ports, TLS/gRPC options, API keys, admin user) and writes the final config to: `~/.local/share/aquiles/aquiles_config.json`
  - Automation / CI: to automate setup in scripts/CI, generate the same JSON schema the wizard writes and place it in the path above, or call `gen_configs_file()` from a small bootstrap script.
- 👨🚀 Deploy pattern (Redis & Qdrant)
  - Introduced `DeployConfigRd` and `DeployConfigQdrant` for deployment-time config generation (the old `DeployConfig` was replaced by these more explicit classes). Use them with `gen_configs_file()` inside a `deploy_*.py` script.
  - Example run commands:
    - Redis: `aquiles-rag deploy --host "0.0.0.0" --port 5500 --workers 4 deploy_redis.py`
    - Qdrant: `aquiles-rag deploy --host "0.0.0.0" --port 5500 --workers 4 deploy_qdrant.py`
  - The `deploy` command imports and executes `run()` from the supplied script, writes `aquiles_config.json`, and then starts the FastAPI app.
- 💻 UI Playground: backend-aware live config
  - The UI Playground can edit and display configuration fields for the currently active backend (if Aquiles-RAG is launched with Qdrant you can tune Qdrant settings; if launched with Redis you can tune Redis settings).
  - Important: some fields apply in place while others require a restart; see the UI docs for the exact list of editable options.
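For the automation route mentioned under the Setup Wizard (placing a pre-generated JSON file at the wizard's path), a bootstrap script might look like the sketch below. It is only an illustration: `install_config` is a hypothetical helper, its `force` flag is an assumption about what an overwrite switch could look like, and the contents of the config dict are deliberately left to the caller, since they must match whatever schema the wizard writes.

```python
import json
import os
import tempfile
from pathlib import Path

# Path the wizard writes to, as stated in these release notes.
DEFAULT_CONFIG_PATH = Path.home() / ".local" / "share" / "aquiles" / "aquiles_config.json"


def install_config(config: dict, dest: Path = DEFAULT_CONFIG_PATH,
                   force: bool = False) -> bool:
    """Write `config` as JSON to `dest`.

    Returns True if the file was written, False if it already existed and
    `force` was not set.
    """
    dest = Path(dest)
    if dest.exists() and not force:
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory and rename it into place,
    # so a starting server never reads a half-written config.
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(config, fh, indent=2)
        os.replace(tmp_name, dest)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return True
```

Calling `gen_configs_file()` from a deploy script, as described above, remains the documented way to produce the file; this sketch only covers the "place the JSON yourself" alternative.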
API & Behavior notes
- Create index: the `POST /create/index` endpoint now supports creating an index/collection in Redis or Qdrant. Responses differ slightly by backend (Redis returns schema + fields; Qdrant returns success + index name).
- Ingestion (`/rag/create`) and Query (`/rag/query-rag`) now work with both backends; the server validates `dtype` for Redis and forwards float arrays to Qdrant.
- Monitoring: `/status/ram` returns Redis memory stats when Redis is used. For Qdrant, Redis-specific memory metrics are not available, so the endpoint returns a short explanatory message in the `redis` field.
Developer notes & migration tips
- Deprecated: manual CLI flags for config. Use the wizard or the deploy pattern for new installs.
- CI / automation: call `gen_configs_file(dp_cfg, force=True)` from your bootstrap script to create the runtime config non-interactively.
- Testing after upgrade:
  - Run `aquiles-rag serve --host "0.0.0.0" --port 5500`.
  - Check readiness: `curl http://localhost:5500/health/ready`.
  - Create a quick index and sanity-test ingestion/query (see API Reference).
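The readiness check in the testing steps can also be scripted, for example in CI after starting the server. A minimal sketch, assuming that an HTTP 200 from `/health/ready` means "ready" (the notes do not specify the response); `http_ready` and `wait_until_ready` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ready(url: str = "http://localhost:5500/health/ready",
               timeout: float = 2.0) -> bool:
    """Return True if the readiness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_ready(check: Callable[[], bool] = http_ready,
                     attempts: int = 10,
                     delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `check` up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A CI job could call `wait_until_ready()` right after launching `aquiles-rag serve` and fail the step if it returns `False`.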
Breaking / notable changes (callouts)
- The per-flag CLI configuration flow has been removed/deprecated in favor of the Setup Wizard. If you relied on the old flags in scripts, update to the deploy pattern or write the JSON config file directly.
- `DeployConfig` was replaced with `DeployConfigRd` and `DeployConfigQdrant`; update any deployment scripts accordingly.
Thanks & where to get help
Thanks to everyone who contributed to the Qdrant integration and the new setup/deploy flow. For details and migration examples see the updated documentation: https://aquiles-ai.github.io/aqRAG-docs/ and the issue tracker (#2).
v0.2.9 - Aquiles-RAG
🚀 Release v0.2.9
- 📝 Changes to the `gen_configs_file()` function for Aquiles-RAG deployment
  - `gen_configs_file()` now takes a new boolean parameter, `force`. When set to `True`, it forces the configuration to be rewritten on deploy. It defaults to `False`.
- ➕ New metadata fields have been added to improve indexing and searching
  - The `/rag/create` and `/rag/query-rag` endpoints accept a new optional parameter, `embedding_model`, which records the model that generated the embeddings. This improves search and adds a layer of security when retrieving information.
- ⚖️ License change from MIT to Apache 2.0
  - We wanted a more robust license for everyone that remains as permissive as MIT.
- ⛏️ Configuration loading has been made asynchronous and `embedding_model` support has been completed on clients
  - `load_aquiles_config()` has been rewritten as an asynchronous function using `aiofiles`. This reduces blocking time during startup/queries (internal testing shows performance improvements of up to 3× on the startup/config path).
  - The `AsyncAquilesRAG` and `AquilesRAG` clients now support embedding model metadata in the `query()` and `send_rag()` methods, including their docstrings.
v0.2.8 - Aquiles - RAG
🚀 Release v0.2.8
- 🛠️ Timeouts have been added to the `AsyncAquilesRAG` client
  - 🚀 While building the aquiles-chat-demo demo, we found timeout errors in the asynchronous client. Adding more robust timeout handling improves its stability.
- ➕ New `drop_index` methods in the Aquiles clients (synchronous and asynchronous) to interact with the `/rag/drop_index` API
v0.2.7 - Aquiles - RAG
🚀 Release v0.2.7
- 🛠️ Auth fixes in the `/docs` and `/redoc` paths
  - 🚀 Due to a configuration error, the `/docs` and `/redoc` paths were still exposed without requiring authentication. This has been corrected and these paths are now protected.
- ➕ Version and function validator in the `AsyncAquilesRAG` client
  - A version validator now tells you when aquiles-rag is out of date, and the `AsyncAquilesRAG` client now accepts both asynchronous and synchronous embedding functions.
v0.2.6 - Aquiles - RAG
🚀 Release v0.2.6
- 🛠️ Fixes in the deploy command
  - 🚀 Fixes were made to the deploy command to make it more stable and fully compatible with workers so it doesn't kill processes during deployment.
- 🗑️ Drop index endpoint
  - ➕ Added `/rag/drop_index`:
    - `index_name: str`: name of the Redis index to delete.
    - `delete_docs: bool`: if `true`, also removes all documents in the index; if `false`, only drops the index definition.
  - Using these inputs, it calls Redis's `dropindex` (with or without documents) and returns the operation result along with the index name.
v0.2.5 - Aquiles - RAG
🚀 Release v0.2.5
- 🛠️ Fix Redis connection handling
  - 🐛 Make it more resilient to failures during intensive use or load testing.
- ✨ New asynchronous client
  - 🚀 Introduced `AsyncAquilesRAG`, a fully async Aquiles-RAG client for non-blocking requests.
- 🧪 New tests
  - 🔁 In `testcg.py`, added a load test that multiplies requests by 10 every 2 s to validate endpoint stability.
- 📊 Status endpoints
  - ➕ Added `/status/ram` to report Redis memory usage and Aquiles-RAG CPU/RAM stats.
  - 🖥️ Added `/status` with an HTML template to visualize those metrics.
- ⚙️ Update `aquiles-rag deploy`
  - ➕ Introduced a `workers` parameter to configure the number of workers for `aquiles-rag`.
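The load-test idea mentioned under "New tests" (request volume multiplied by 10 every 2 s) can be illustrated with a short asyncio sketch. This is not the code in `testcg.py`; `ramp_schedule` and `run_ramp` are hypothetical names, and the request itself is injected as a coroutine function so any async HTTP client can be plugged in.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


def ramp_schedule(start: int, factor: int, steps: int) -> List[int]:
    """Number of concurrent requests per step, e.g. 1, 10, 100, ..."""
    return [start * factor ** i for i in range(steps)]


async def run_ramp(request: Callable[[], Awaitable[Any]],
                   schedule: List[int],
                   pause: float = 2.0) -> List[int]:
    """Fire each batch concurrently and return the failure count per step."""
    failures: List[int] = []
    for i, batch_size in enumerate(schedule):
        results = await asyncio.gather(
            *(request() for _ in range(batch_size)),
            return_exceptions=True,  # collect errors instead of aborting the batch
        )
        failures.append(sum(isinstance(r, Exception) for r in results))
        if i < len(schedule) - 1:
            await asyncio.sleep(pause)  # wait before the next, larger batch
    return failures
```

A run such as `asyncio.run(run_ramp(my_request, ramp_schedule(1, 10, 4)))` would send batches of 1, 10, 100 and 1000 requests, where `my_request` is your own coroutine hitting an endpoint such as `/rag/query-rag`.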
First stable and deployable version
Aquiles-RAG 🏷️ v0.2.0
🚀 Highlights
- Secure API
  - Added API-key based protection on all RAG endpoints (`/create/index`, `/rag/create`, `/rag/query-rag`).
  - Configurable list of allowed API keys in `aquiles_config.json`.
- OAuth-Protected UI & Docs
  - Username/password login (`/token`) to access the mini-UI, Swagger UI (`/docs`) and ReDoc (`/redoc`).
  - All UI routes now require a valid `access_token` cookie.
- Unified Deployment Workflow
  - `DeployConfig` + `gen_configs_file()` to bootstrap `aquiles_config.json` from a single class.
  - `aquiles-rag deploy` CLI command dynamically loads your `run()` from any Python file and then starts Uvicorn.
- Python SDK Improvements
  - `AquilesRAG` client now supports sending API-Key headers automatically.
  - `send_rag()` chunking & upload matches the new RAG endpoints out-of-the-box.
- Online Docs Published
  - Full documentation site now live → https://aquiles-ai.github.io/aqRAG-docs/
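To make the API-key protection in the highlights above concrete, here is a rough, framework-free sketch of checking an `X-API-Key` header against a configured allow-list. It is not the project's `verify_api_key` dependency: `is_allowed_key` and `check_request` are hypothetical names, the 403 status is an assumption, and the constant-time comparison is simply a common precaution rather than something these notes describe.

```python
import hmac
from typing import Iterable, Mapping, Optional


def is_allowed_key(presented: Optional[str], allowed: Iterable[str]) -> bool:
    """Return True if `presented` matches one of the allowed API keys."""
    if not presented:
        return False
    presented_bytes = presented.encode("utf-8")
    matched = False
    for key in allowed:
        # compare_digest avoids leaking how many leading characters matched.
        if hmac.compare_digest(presented_bytes, key.encode("utf-8")):
            matched = True
    return matched


def check_request(headers: Mapping[str, str], allowed: Iterable[str]) -> int:
    """Return an HTTP-style status: 200 if the key is accepted, else 403 (assumed)."""
    return 200 if is_allowed_key(headers.get("X-API-Key"), allowed) else 403
```

In this sketch the `allowed` list would come from the allowed-keys setting in `aquiles_config.json`; how the real server loads and applies it is not shown here.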
📝 Changelog
New Features
- API Security
  - `verify_api_key` dependency on `/create/index`, `/rag/create`, `/rag/query-rag`.
  - `allows_api_keys` and `allows_users` configurable via `aquiles_config.json` or CLI.
- UI & Swagger Protection
  - `/login/ui` and `/token` routes to issue JWT in an HTTP-only cookie.
  - Protected `/ui`, `/docs`, `/redoc`, and OpenAPI JSON behind OAuth.
- Deployment Config
  - `aquiles.deploy_config.DeployConfig` extends `InitConfigs` with `JWT_SECRET` and `ALGORITHM`.
  - `aquiles-rag deploy --host HOST --port PORT CONFIG_FILE.py`
  - Example `test_deploy.py` for Render: generate configurations and start the server in one command.
- Client SDK
  - `AquilesRAG` constructor accepts `api_key`; all requests now include `X-API-Key`.
  - Compatibility fixes for `create_index`, `send_rag`, and `query` methods.
- Documentation
  - Published at https://aquiles-ai.github.io/aqRAG-docs/
Bug Fixes & Enhancements
- Fixed race conditions in index-drop logic when `delete_the_index_if_it_exists=true`.
- Improved error handling and status codes on Redis failures (400 vs 500).
- UI playground now displays live configs, create/index form, RAG ingest form, and query form.
- Updated `chunk_text_by_words()` default to ~600 words per chunk for optimal performance.
📦 Installation / Upgrade
pip install --upgrade aquiles-rag==0.2.0
Then, if you’re deploying from source:
git checkout v0.2.0
pip install -r requirements.txt
Many of the changes were proposed in this issue: #1
🔗 Useful Links
- 📖 Online Docs: https://aquiles-ai.github.io/aqRAG-docs/
- 🐛 Report issues: https://github.com/Aquiles-ai/Aquiles-RAG/issues
First major update
What does this update include?
- Aquiles-RAG has been converted to a minimally functional and installable package.
- The CLI is now working. For now, we only recommend using the following commands (configuration commands are still under development):
Hello:
fredy@fredy-ProLiant-MicroServer-Gen10:~/projects/Aquiles-RAG$ aquiles-rag hello --name Fredy
Hello, Fredy!
Start the server:
fredy@fredy-ProLiant-MicroServer-Gen10:~/projects/Aquiles-RAG$ aquiles-rag serve
INFO: Started server process [139767]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:5500 (Press CTRL+C to quit)
INFO: 192.168.1.6:51308 - "GET /ui HTTP/1.1" 200 OK
INFO: 192.168.1.6:51309 - "GET /docs HTTP/1.1" 200 OK
INFO: 192.168.1.6:51309 - "GET /openapi.json HTTP/1.1" 200 OK
- A mini-UI has been added. It is intended to offer configuration options from a simple, intuitive, and clean web interface in the future, and possibly to let you test endpoints directly from the mini-UI.
What's missing?
- Finish developing the functions to configure connections to Redis.
- Manage search and write to Redis from Aquiles-RAG.
- Design an efficient Redis save and search scheme for most RAG use cases.
- Complete the mini-UI.
- Handle runtime errors.
- Possible expansion to other high-performance databases.
Perhaps further expansions to Aquiles-RAG will be considered later, but for now, we have this as a base.