A Retrieval-Augmented Generation (RAG) system that enhances LLM responses with relevant document context.
- Efficient document chunking with context preservation
- Markdown-aware processing that maintains document structure
- Semantic search using multilingual embeddings
- Incremental document updates using content hashing
- Integration with Google's Gemini LLM
- Install dependencies:

  ```bash
  uv sync
  ```
- Set up environment variables. Any LiteLLM-supported model can be used together with its corresponding API key; see the LiteLLM docs for the full list of models and providers:

  ```
  GEMINI_API_KEY=your_api_key_here
  ```
- Place your documentation in the `docs/` directory as markdown files.
```bash
uv run agent.py "Describe the caves of Xylos."  # optional: -web adds a web-search tool to the agent
```
Embedding creation and database management are handled through the CLI:
```bash
# Add/update documents and run the test query
uv run src/embed.py

# List all stored passages
uv run src/embed.py list

# Clear the database
uv run src/embed.py clear

# Perform a custom search query for testing
uv run src/embed.py "your search query"
```
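For orientation, this is roughly the dispatch logic such a CLI implies. It is a minimal sketch only, with print placeholders standing in for the real embedding and search functions; the actual `src/embed.py` may be organized differently:

```python
import sys

# Hypothetical placeholders standing in for the real embed/search logic.
def sync_docs(): print("syncing docs/ into the vector store...")
def list_passages(): print("listing stored passages...")
def clear_db(): print("clearing the database...")
def search(query): print(f"searching for: {query}")

def main():
    args = sys.argv[1:]
    if not args:
        sync_docs()            # default: add/update documents, run the test query
    elif args[0] == "list":
        list_passages()
    elif args[0] == "clear":
        clear_db()
    else:
        search(" ".join(args))  # anything else is treated as a search query

if __name__ == "__main__":
    main()
```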
- `text_chunker.py`: Handles intelligent document splitting
- `embed.py`: Manages document processing and the vector database
- `retriever.py`: Implements semantic search functionality
- `agent.py`: Integrates components with the LLM using smolagents
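To make the integration concrete, here is a minimal sketch of how `agent.py` might expose the retriever as a smolagents tool. The `search` import, tool name, and model id are illustrative assumptions, not the repository's actual code:

```python
import os

from smolagents import CodeAgent, LiteLLMModel, tool
from retriever import search  # hypothetical: semantic search over the vector store

@tool
def retrieve_docs(query: str) -> str:
    """Return documentation passages relevant to the query.

    Args:
        query: Natural-language question to search the docs for.
    """
    return "\n\n".join(search(query))

model = LiteLLMModel(
    model_id="gemini/gemini-2.0-flash",  # any LiteLLM-supported model id works
    api_key=os.environ["GEMINI_API_KEY"],
)
agent = CodeAgent(tools=[retrieve_docs], model=model)
print(agent.run("Describe the caves of Xylos."))
```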
- Uses Alibaba's multilingual embedding model for semantic search (see the embedding sketch after this list)
- ChromaDB for vector storage
- Google Gemini for LLM responses
- Smart chunking preserves document context and header hierarchy (see the chunking sketch below)
- Incremental updates avoid re-embedding unchanged content (content hashing is illustrated in the embedding sketch)
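The embedding and incremental-update ideas combine naturally. Below is a minimal sketch of hash-gated embedding into ChromaDB, assuming the `Alibaba-NLP/gte-multilingual-base` sentence-transformers model; the collection name, paths, and metadata layout are assumptions, and the real pipeline chunks documents before embedding rather than embedding whole files:

```python
import hashlib
from pathlib import Path

import chromadb
from sentence_transformers import SentenceTransformer

# gte-multilingual-base ships custom code, hence trust_remote_code=True.
model = SentenceTransformer("Alibaba-NLP/gte-multilingual-base", trust_remote_code=True)
client = chromadb.PersistentClient(path="chroma_db")  # illustrative path
collection = client.get_or_create_collection("docs")

for path in Path("docs").glob("*.md"):
    text = path.read_text(encoding="utf-8")
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    # Skip files whose stored content hash matches: nothing to re-embed.
    existing = collection.get(where={"source": str(path)}, limit=1)
    if existing["metadatas"] and existing["metadatas"][0].get("hash") == digest:
        continue
    collection.upsert(
        ids=[str(path)],
        documents=[text],
        embeddings=[model.encode(text).tolist()],
        metadatas=[{"source": str(path), "hash": digest}],
    )

# Semantic search: embed the query and fetch the nearest passages.
hits = collection.query(query_embeddings=[model.encode("your search query").tolist()], n_results=3)
print(hits["documents"][0])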
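```

And a rough illustration of header-aware chunking: each chunk is prefixed with the path of headings above it, so retrieved passages keep their context. This is a sketch of the technique only, not the actual `text_chunker.py` implementation:

```python
def chunk_markdown(text: str, max_chars: int = 1000) -> list[str]:
    """Split markdown into chunks, prefixing each with its heading path."""
    chunks, header_path, buffer = [], [], []

    def flush():
        if buffer:
            context = " > ".join(header_path)
            chunks.append((context + "\n\n" if context else "") + "\n".join(buffer))
            buffer.clear()

    for line in text.splitlines():
        if line.startswith("#"):
            flush()
            level = len(line) - len(line.lstrip("#"))
            # Trim the path back to this heading's level, then descend.
            del header_path[level - 1:]
            header_path.append(line.lstrip("# ").strip())
        else:
            buffer.append(line)
            if sum(len(l) for l in buffer) > max_chars:
                flush()
    flush()
    return chunks
```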
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a new Pull Request