Meet Ragstar: Your AI Data Analyst for dbt Projects.

**Previously known as dbt-llm-agent**

Ragstar is an AI agent that learns the context of your dbt project to answer questions, generate documentation, and bring data insights closer to your users. Interact via Slack or CLI, and watch it improve over time with feedback.

BETA NOTICE: Ragstar is currently in beta. Core features include agentic model interpretation and semantic question answering about your dbt project.

Key Value

Democratize Data Access: Allow anyone to ask questions about your dbt project in natural language via Slack or CLI.
Automate Documentation: Generate model and column descriptions where they're missing, improving data catalog quality.
Enhance Data Discovery: Quickly find relevant models and understand their logic without digging through code.
Continuous Learning: Ragstar learns from feedback to provide increasingly accurate and helpful answers.

Key Features

Natural Language Q&A: Ask about models, sources, metrics, lineage, etc.
Agentic Interpretation: Intelligently analyzes dbt models, understanding logic and context.
Automated Documentation Generation: Fills documentation gaps using LLMs.
Semantic Search: Finds relevant assets based on meaning, not just keywords.
dbt Integration: Parses metadata from dbt Cloud, local runs (manifest.json), or source code.
Postgres + pgvector Backend: Stores metadata and embeddings efficiently.
Feedback Loop: Tracks questions and feedback for improvement.
Slack Integration: Built-in Slackbot for easy interaction.

Use Cases

Accelerate Data Discovery: Quickly find relevant dbt models and understand their purpose without digging through code.
Improve Onboarding: Help new team members understand the dbt project structure and logic faster.
Maintain Data Documentation: Keep dbt documentation up-to-date with automated generation and suggestions.
Enhance Data Governance: Gain better visibility into data lineage and model dependencies.
Debug dbt Models: Ask clarifying questions about model logic and calculations.

Architecture

Ragstar combines several technologies to provide its capabilities:

dbt Project Parsing: Extracts comprehensive metadata from dbt artifacts (manifest.json) or source files (.sql, .yml), including models, sources, exposures, metrics, tests, columns, descriptions, and lineage.
PostgreSQL Database with pgvector: Serves as the central knowledge store. It holds structured metadata parsed from the dbt project, generated documentation, question/answer history, and vector embeddings of model and column descriptions for semantic search.
Vector Embeddings: Creates numerical representations (embeddings) of model and column documentation using sentence-transformer models. These embeddings capture semantic meaning, enabling powerful search capabilities.
Large Language Models (LLMs): Integrates with LLMs (e.g., OpenAI's GPT models) via APIs to:
- Understand natural language questions.
- Generate human-readable answers based on retrieved context from the database and embeddings.
- Interpret model logic and generate documentation.
Agentic Reasoning: Employs a step-by-step reasoning process, especially for model interpretation, where it breaks down the task, gathers evidence (e.g., upstream model definitions), and synthesizes an interpretation, similar to how a human analyst would approach it.
CLI Interface: Provides command-line tools (ragstar ...) for initialization, embedding generation, asking questions, providing feedback, and managing the system.
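To make the parsing and evidence-gathering steps above more concrete, here is a minimal sketch. It is not Ragstar's actual implementation. It assumes the standard dbt manifest.json layout (a top-level `nodes` mapping whose entries carry `resource_type`, `name`, `description`, `columns`, and `depends_on.nodes`). The helper names `extract_models` and `upstream_models` and the sample data are hypothetical.

```python
from typing import Any


def extract_models(manifest: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Collect name, description, columns, and parents for every model node."""
    models = {}
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and other node types
        models[unique_id] = {
            "name": node["name"],
            "description": node.get("description", ""),
            "columns": sorted(node.get("columns", {})),
            "parents": list(node.get("depends_on", {}).get("nodes", [])),
        }
    return models


def upstream_models(models: dict[str, dict[str, Any]], unique_id: str) -> list[str]:
    """Return every upstream model id, nearest parents first (breadth-first)."""
    seen: list[str] = []
    queue = list(models[unique_id]["parents"])
    while queue:
        current = queue.pop(0)
        if current in seen or current not in models:
            continue  # already visited, or a source / non-model node
        seen.append(current)
        queue.extend(models[current]["parents"])
    return seen


# Hypothetical in-memory manifest; a real one would be read from
# target/manifest.json with json.load().
sample_manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "resource_type": "model",
            "name": "stg_orders",
            "description": "Cleaned orders.",
            "columns": {"order_id": {}, "customer_id": {}},
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
        },
        "model.shop.stg_customers": {
            "resource_type": "model",
            "name": "stg_customers",
            "description": "",
            "columns": {"customer_id": {}},
            "depends_on": {"nodes": []},
        },
        "model.shop.orders": {
            "resource_type": "model",
            "name": "orders",
            "description": "One row per order.",
            "columns": {"order_id": {}, "customer_id": {}},
            "depends_on": {
                "nodes": ["model.shop.stg_orders", "model.shop.stg_customers"]
            },
        },
        "test.shop.not_null_orders_order_id": {
            "resource_type": "test",
            "name": "not_null_orders_order_id",
            "depends_on": {"nodes": ["model.shop.orders"]},
        },
    }
}

models = extract_models(sample_manifest)
print(upstream_models(models, "model.shop.orders"))
```

An interpretation step could pass the upstream definitions gathered this way to the LLM as context. A model with an empty description, such as `stg_customers` here, would be a candidate for generated documentation.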

Setup

Setting up Ragstar involves configuring environment variables and initializing the application with your dbt project data. The recommended method is using Docker Compose, which bundles the application and a PostgreSQL database with the required pgvector extension.

Option 1: Docker Compose (Recommended)

Clone the repository:

```
git clone https://github.com/pragunbhutani/ragstar.git
cd ragstar
```

Set up environment variables: Copy .env.example to .env and fill in your configuration, such as your OpenAI API key and APP_HOST (e.g., localhost or your server's IP address).
```
cp .env.example .env
# Open .env and fill in your values
```
Configure Ragstar rules: Copy .ragstarrules.example.yml to .ragstarrules.yml. This file lets you define custom instructions and behaviors for your RAG application.
```
cp .ragstarrules.example.yml .ragstarrules.yml
# Open .ragstarrules.yml and customize if needed
```
Build and run with Docker Compose: This command will build the Docker images and start the application in detached mode.
```
docker compose up --build -d
```

Run initial Django commands: Execute these commands in the app container to set up the database and create an admin user.

```
docker compose exec app uv run python manage.py migrate
docker compose exec app uv run python manage.py createsuperuser # Follow prompts to create your admin user
```

Initialize your project: This command sets up the necessary project configurations. You can choose between cloud, core, or local methods.
```
docker compose exec app uv run python manage.py init_project --method cloud
# Or --method core, or --method local
```
Access the Django Admin: Open your web browser and navigate to http://<your_APP_HOST_value>/admin (e.g., http://localhost/admin if APP_HOST=localhost). Log in with the superuser credentials you created.
Embed Models: In the Django admin interface, you can:
- Navigate to "Models".
- Click on "Interpret".
- Select and embed the models you want to use for answering questions.

Option 2: Local Python Environment (Advanced)

If you prefer not to use Docker, you can set up a local Python environment.

Prerequisites:
- Python 3.10 or higher.
- uv (Python package installer and virtual environment manager). You can install it from https://github.com/astral-sh/uv.
- A running PostgreSQL server (version 11+) with the pgvector extension enabled.
Check Python Version:
```
python --version # or python3 --version
```
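If you would rather check the version programmatically, for example in a setup script, a minimal sketch follows. The helper name `meets_minimum` is hypothetical. The 3.10 minimum comes from the prerequisites above.

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 10)):
    """Return True when the interpreter is at least `minimum` (major, minor)."""
    return tuple(version_info[:2]) >= minimum


# Report on the interpreter that is running this script.
if meets_minimum():
    print("Python version OK")
else:
    print(f"Python 3.10+ required, found {sys.version.split()[0]}")
```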

Clone Repository:

```
git clone https://github.com/pragunbhutani/ragstar.git
cd ragstar
```

Create a virtual environment and install dependencies:

```
# Create a virtual environment (e.g., named .venv)
python -m venv .venv
# Or using uv: uv venv

# Activate the virtual environment
source .venv/bin/activate # On Windows: .venv\Scripts\activate

# Install dependencies using uv
uv pip install -r requirements.txt
# Or if you have a pyproject.toml configured for uv:
# uv pip install -e .
```

Set up PostgreSQL:
- Install PostgreSQL and pgvector.
- Create a database (e.g., ragstar_local_dev).
- Enable the pgvector extension in the database:
```
-- Run in psql connected to your database
CREATE EXTENSION IF NOT EXISTS vector;
```
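With the extension enabled, embeddings can be stored in vector columns and ranked by distance. pgvector's `<=>` operator computes cosine distance (1 minus cosine similarity). The sketch below reproduces that ranking in plain Python so the idea is easy to inspect. The function names and the toy 3-dimensional vectors are hypothetical. Real embeddings have far more dimensions, and the ranking would normally run in SQL.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors: 1 - cos(angle)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, embeddings, k=2):
    """Return the k names whose vectors are closest to `query`."""
    ranked = sorted(
        embeddings, key=lambda name: cosine_distance(query, embeddings[name])
    )
    return ranked[:k]


# Toy vectors standing in for embedded model descriptions.
toy_embeddings = {
    "orders": [0.9, 0.1, 0.0],
    "customers": [0.1, 0.9, 0.0],
    "payments": [0.7, 0.2, 0.1],
}

print(nearest([1.0, 0.0, 0.0], toy_embeddings))
```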
Configure Environment Variables: Copy the example environment file and fill in your details:
```
cp .env.example .env
```
Edit .env:
- Required: Set your OPENAI_API_KEY.
- Required: Set DATABASE_URL to your local PostgreSQL connection string (e.g., postgresql://user:password@host:port/dbname). Ensure this matches the database you created. (The .env.example may define fallback variables such as DB_NAME_FALLBACK, which are used when DATABASE_URL is not set; for a local setup, setting DATABASE_URL explicitly is clearer.)
- Required: Set APP_HOST (e.g., localhost or 127.0.0.1).
- Ragstar Rules: Copy .ragstarrules.example.yml to .ragstarrules.yml and customize if needed.
- Slack (Optional): Configure INTEGRATIONS_SLACK_BOT_TOKEN and INTEGRATIONS_SLACK_SIGNING_SECRET if you plan to use the Slack integration.
- Other: Review other variables like RAGSTAR_LOG_LEVEL, etc., and adjust if needed.
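Before running migrations, it can help to confirm that the required variables are set and that DATABASE_URL parses as expected. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the values are placeholders, and the script does not read .env itself (in real use you would pass `os.environ` after the file has been loaded).

```python
from urllib.parse import urlsplit

# The three variables marked "Required" in the list above.
REQUIRED_VARS = ("OPENAI_API_KEY", "DATABASE_URL", "APP_HOST")


def missing_required(env):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def describe_database_url(url):
    """Split a PostgreSQL connection URL into its components."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "dbname": parts.path.lstrip("/"),
    }


# Placeholder values for illustration only.
example_env = {
    "OPENAI_API_KEY": "sk-placeholder",
    "DATABASE_URL": "postgresql://ragstar:secret@localhost:5432/ragstar_local_dev",
    "APP_HOST": "",
}

print(missing_required(example_env))
print(describe_database_url(example_env["DATABASE_URL"]))
```

Here APP_HOST is deliberately left empty, so it is reported as missing.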
Run Database Migrations: Apply database schema changes:
```
uv run python manage.py migrate
```
Create a Superuser: Create an admin account to access the Django admin interface:
```
uv run python manage.py createsuperuser
# Follow the prompts
```

Initialize your project: This command sets up the necessary project configurations.

```
uv run python manage.py init_project --method cloud
# Or --method core, or --method local, depending on your dbt project setup.
```

Run the Development Server:
```
uv run python manage.py runserver
```
The application will typically be available at http://<your_APP_HOST_value>:8000 (e.g., http://localhost:8000).
Access the Django Admin & Embed Models: Follow the "Access the Django Admin" and "Embed Models" steps from the Docker setup, using http://<your_APP_HOST_value>:8000/admin as the admin URL, to embed your models.

Usage

After setup and initialization, you can interact with Ragstar.

Using Docker Compose:

Most Django manage.py commands should be run inside the app container using docker compose exec:

```
# Example: Run database migrations (if not already done by entrypoint)
docker compose exec app uv run python manage.py migrate

# Example: Create a superuser (if not done during initial setup)
docker compose exec app uv run python manage.py createsuperuser

# Example: Initialize project
docker compose exec app uv run python manage.py init_project --method cloud
```

The application server is started automatically by docker compose up. Access it via http://<your_APP_HOST_value>/admin.

Using Local Python Environment:

Run Django manage.py commands directly using uv run from your activated virtual environment:

```
# Example: Run the development server
uv run python manage.py runserver

# Example: Create a superuser
uv run python manage.py createsuperuser

# Example: Initialize project
uv run python manage.py init_project --method cloud
```

Access the application at http://<your_APP_HOST_value>:8000 and the admin interface at http://<your_APP_HOST_value>:8000/admin.

Core Django Management Commands

The primary way to manage and interact with the application (outside of the web interface) is through Django's manage.py script. Here are some key commands:

uv run python manage.py migrate: Applies database migrations.
uv run python manage.py createsuperuser: Creates an administrator account.
uv run python manage.py init_project --method <cloud|core|local>: Initializes Ragstar with your project data (e.g., from dbt). This is crucial for setting up the knowledge base.
uv run python manage.py runserver [host:port]: Starts the Django development web server.

Other functionalities, such as interpreting and embedding models, are primarily handled through the Django admin interface after logging in.
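The Docker and local invocations of these commands differ only by the `docker compose exec app` prefix. If you script them, a small helper can build either form. This is a minimal sketch, and the name `manage_command` is hypothetical.

```python
import shlex

DOCKER_PREFIX = ["docker", "compose", "exec", "app"]
MANAGE = ["uv", "run", "python", "manage.py"]


def manage_command(*args, docker=True):
    """Build the argument list for a manage.py call in either setup."""
    prefix = DOCKER_PREFIX if docker else []
    return [*prefix, *MANAGE, *args]


# Print the commands; the lists can also be passed to subprocess.run(cmd, check=True).
print(shlex.join(manage_command("init_project", "--method", "cloud")))
print(shlex.join(manage_command("migrate", docker=False)))
```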

Slack Integration (Optional)

Ragstar provides a Slack manifest to easily integrate its functionalities into your Slack workspace.

Ensure INTEGRATIONS_SLACK_SIGNING_SECRET and INTEGRATIONS_SLACK_BOT_TOKEN are set in your .env file.
Use the .slack_manifest.example.json file as a template to create a new Slack app.
Follow Slack's documentation for creating an app from a manifest.
This will enable features like asking questions and receiving answers directly within Slack (assuming the Slack integration is running as part of the Django application).
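For context on what the signing secret is for: Slack signs each request it sends to an app, and the app uses the secret to confirm that the request really came from Slack. The sketch below follows Slack's documented v0 signing scheme (HMAC-SHA256 over `v0:<timestamp>:<body>`, compared against the `X-Slack-Signature` header). The source does not say whether Ragstar performs this check itself or relies on a Slack SDK, so treat the function names here as hypothetical.

```python
import hashlib
import hmac
import time


def slack_signature(signing_secret, timestamp, body):
    """Compute the v0 signature Slack would send for this timestamp and body."""
    base = f"v0:{timestamp}:{body}".encode("utf-8")
    digest = hmac.new(signing_secret.encode("utf-8"), base, hashlib.sha256).hexdigest()
    return f"v0={digest}"


def is_valid_request(signing_secret, timestamp, body, signature, now=None, max_age=300):
    """Check the signature and reject requests older than max_age seconds."""
    now = time.time() if now is None else now
    if abs(now - int(timestamp)) > max_age:
        return False  # stale timestamp: possible replay
    expected = slack_signature(signing_secret, timestamp, body)
    return hmac.compare_digest(expected, signature)
```

In a request handler, `timestamp` would come from the `X-Slack-Request-Timestamp` header, `signature` from `X-Slack-Signature`, and `body` would be the raw, unparsed request body.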

Contributing

Contributions are welcome! Please follow the standard fork-and-pull-request workflow.

License

MIT License
