RAG.WTF is a comprehensive, Flutter-based Retrieval-Augmented Generation (RAG) application built to be open-source, modular, and easy to deploy. It provides developers with full control over their data and infrastructure, demystifying the complexities of building RAG systems by offering a clear, feature-rich starting point.
A GIF demonstrating the full user flow (uploading a document, asking a question, and receiving a sourced answer) would be ideal here.
[Screenshot or GIF of the RAG.WTF App UI in Action]
- Why RAG.WTF?
- Key Features
- Architectural Overview
- Project Structure
- Technologies Used
- Getting Started
- Configuration
- Security Considerations
- Scripts
- Testing
- CI/CD & Deployment
- Troubleshooting / FAQ
- Roadmap
- Community & Showcase
- Contributing
- License
We named it RAG.WTF because we wanted to answer the question, "What's The File?" with unprecedented accuracy, and to make the setup process so simple you'll say, "Wow, That's Fast!"
The primary goal of RAG.WTF is to provide a powerful, client-centric RAG solution. By leveraging SurrealDB's WebAssembly (WASM) capabilities, it can run entirely in the browser, offering significant benefits:
- Enhanced Data Privacy: User data can remain on the client machine, never being sent to a server.
- Cost-Effective: Reduces the need for server-side infrastructure, making it highly economical for personal use or prototyping.
- Offline-First Potential: Lays the groundwork for future offline capabilities.
- Client-Side RAG: Runs entirely in the browser using SurrealDB WASM for a secure, serverless-optional experience.
- Modular Monorepo: Built with Melos, the project separates concerns into distinct packages (chat, document, settings, etc.).
- Multi-Provider LLM Support: Comes pre-configured for various local and cloud-based LLM providers, including Ollama, OpenAI, Anthropic, Gemini, and more.
- Comprehensive RAG Settings: A dedicated UI to configure every aspect of the RAG pipeline, from chunking and embedding to retrieval and generation parameters.
- Clean State Management: Utilizes the Stacked architecture with its ViewModel pattern for a clear separation between UI and business logic (see the sketch after this list).
- CI/CD Ready: Includes GitHub Actions workflows for automated testing and deployment to Netlify.
- Cross-Platform: Built with Flutter for seamless deployment across web, mobile, and desktop.
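To illustrate the Stacked pattern mentioned above, here is a minimal, generic sketch. The class names are invented for this example and do not appear in the repository; the point is only the shape: a view model extends BaseViewModel, and the view binds to it with ViewModelBuilder.

```dart
import 'package:flutter/material.dart';
import 'package:stacked/stacked.dart';

/// Illustrative view model; the app's packages use richer, service-backed ones.
class GreetingViewModel extends BaseViewModel {
  String _greeting = 'Hello, RAG.WTF!';
  String get greeting => _greeting;

  Future<void> refreshGreeting() async {
    setBusy(true); // drives loading indicators in the view
    await Future<void>.delayed(const Duration(milliseconds: 300));
    _greeting = 'Updated at ${DateTime.now()}';
    setBusy(false); // setBusy also notifies listeners
  }
}

/// The view stays free of business logic and simply reacts to the view model.
class GreetingView extends StatelessWidget {
  const GreetingView({super.key});

  @override
  Widget build(BuildContext context) {
    return ViewModelBuilder<GreetingViewModel>.reactive(
      viewModelBuilder: () => GreetingViewModel(),
      builder: (context, viewModel, child) => Scaffold(
        body: Center(
          child: viewModel.isBusy
              ? const CircularProgressIndicator()
              : Text(viewModel.greeting),
        ),
        floatingActionButton: FloatingActionButton(
          onPressed: viewModel.refreshGreeting,
          child: const Icon(Icons.refresh),
        ),
      ),
    );
  }
}
```

In the app itself, view models like these delegate to the chat, document, and database services shown in the architecture diagram below rather than doing the work inline.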
The application follows a standard RAG pipeline, orchestrated across its modular packages.
- Document Ingestion & Processing: A user uploads a document through the UI (packages/document). The document is sent to an external text-splitting service, chunked, and then vector embeddings are generated for each chunk via the configured LLM provider.
- Storage: The text chunks and their corresponding vector embeddings are stored locally or remotely in SurrealDB (packages/database).
- Retrieval & Generation: When a user submits a query in the chat interface (packages/chat), the query is vectorized. A similarity search is performed against the stored vectors to find the most relevant document chunks. These chunks are then combined with the original query into a detailed prompt, which is sent to the LLM to generate a final, context-aware answer.
```mermaid
graph TD
    subgraph "UI Layer (Flutter)"
        A[Document Upload UI] --> B{Document Service};
        C[Chat UI] --> D{Chat Service};
    end
    subgraph "Core Logic (Dart Packages)"
        B --> E[Splitting & Embedding];
        E --> F{Database Service};
        D --> G[Query Vectorization];
        G --> F;
        F --> H[Similarity Search];
        H --> I{Prompt Construction};
        I --> J{LLM Generation};
    end
    subgraph "Backend & Services"
        K[SurrealDB]
        L[LLM Provider API]
        M[Text Splitting Service]
    end
    B -- Sends File --> M;
    F -- Stores/Retrieves Data --> K;
    E -- Generates Embeddings --> L;
    J -- Generates Response --> L;
    J --> C;
```
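To make the pipeline concrete, the sketch below walks the same three stages in plain Dart. It is illustrative rather than the app's actual code: the endpoints and model names come from the local Ollama defaults in .env.example, the chunks are hard-coded instead of coming from the split service, and the similarity search runs in Dart for brevity where the app delegates it to SurrealDB.

```dart
import 'dart:convert';
import 'dart:math' as math;

import 'package:http/http.dart' as http;

// Endpoints matching the defaults in .env.example (local Ollama setup).
const embeddingsUrl = 'http://localhost:11434/v1/embeddings';
const chatUrl = 'http://localhost:11434/v1/chat/completions';

/// Requests an embedding for [text] from an OpenAI-compatible endpoint.
Future<List<double>> embed(String text) async {
  final response = await http.post(
    Uri.parse(embeddingsUrl),
    headers: {'Content-Type': 'application/json'},
    body: jsonEncode({'model': 'nomic-embed-text', 'input': text}),
  );
  final data = jsonDecode(response.body) as Map<String, dynamic>;
  final embedding = (data['data'] as List).first['embedding'] as List;
  return embedding.cast<num>().map((e) => e.toDouble()).toList();
}

/// Cosine similarity; in the app this comparison happens inside SurrealDB.
double cosine(List<double> a, List<double> b) {
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}

/// Asks the chat endpoint to answer [question] using [context] only.
Future<String> generate(String question, String context) async {
  final response = await http.post(
    Uri.parse(chatUrl),
    headers: {'Content-Type': 'application/json'},
    body: jsonEncode({
      'model': 'llama3.2',
      'messages': [
        {
          'role': 'user',
          'content': 'Answer using only this context:\n$context\n\n'
              'Question: $question',
        },
      ],
    }),
  );
  final data = jsonDecode(response.body) as Map<String, dynamic>;
  return (data['choices'] as List).first['message']['content'] as String;
}

Future<void> main() async {
  // 1. Ingestion: chunk a document (the app calls the split service instead).
  const chunks = [
    'RAG.WTF stores chunks and embeddings in SurrealDB.',
    'Melos manages the monorepo packages.',
  ];
  final chunkEmbeddings = [for (final c in chunks) await embed(c)];

  // 2. Retrieval: embed the query and rank chunks by similarity.
  const question = 'Where are embeddings stored?';
  final queryEmbedding = await embed(question);
  final ranked = List<int>.generate(chunks.length, (i) => i)
    ..sort((a, b) => cosine(queryEmbedding, chunkEmbeddings[b])
        .compareTo(cosine(queryEmbedding, chunkEmbeddings[a])));
  final context = chunks[ranked.first];

  // 3. Generation: combine the query and context into a prompt for the LLM.
  print(await generate(question, context));
}
```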
The project is a Melos-managed monorepo. Key directories and files include:
```
├── .github/          # CI/CD workflows (GitHub Actions, Netlify, Coolify)
├── packages/
│   ├── analytics/    # Event tracking (Mixpanel, Firebase)
│   ├── chat/         # Chat UI, state management, and RAG backend communication
│   ├── console/      # Console interface for direct system interaction
│   ├── database/     # SurrealDB connection management and models
│   ├── document/     # File uploads, chunking, and embedding logic
│   ├── settings/     # Management of LLM providers and RAG parameters
│   └── ui/           # Shared UI components and widgets
├── scripts/          # Utility scripts for building, generating code, etc.
├── lib/              # Main application shell and bootstrapping logic
├── firebase.json     # Firebase configurations for dev, stg, and prod
├── melos.yaml        # Monorepo scripts and workspace definitions
└── pubspec.yaml      # Root project dependencies
```
- Framework: Flutter
- Language: Dart (>=3.8.1 <4.0.0)
- Monorepo Management: Melos
- State Management: Stacked
- Database: SurrealDB (via surrealdb_wasm)
- Backend Services: Firebase (for multi-environment config)
- CI/CD: GitHub Actions
- Deployment: Netlify and Coolify
- Flutter SDK
- Dart SDK (>=3.8.1 <4.0.0)
- Melos (dart pub global activate melos)
- Docker (for the full local setup)
| Setup Type | Use Case | Requirements | Data Privacy | Cost |
|---|---|---|---|---|
| Browser-Only | Quick demos, testing the UI, using cloud LLMs. | Browser only. | Depends on the cloud LLM provider. | Free (excluding API costs). |
| Full Local | Offline development, maximum data privacy, no API costs. | Docker, Ollama. | Maximum (data never leaves your machine). | Free. |
This method runs the application using SurrealDB's in-browser storage (IndexedDB), requiring no external database or LLM setup.
- Clone and Bootstrap:

  ```sh
  git clone https://github.com/rag-wtf/app.git
  cd app
  melos bootstrap
  ```

- Run the App:

  ```sh
  melos run
  ```

  The melos run command executes the Flutter application in the development flavor, using configuration from .env.dev.
This setup provides a complete offline RAG experience using local models via Ollama and a local SurrealDB instance.
- Clone and Bootstrap (if you haven't already).

- Set up Ollama:
  - Install Ollama.
  - Pull the default models required by the application:

    ```sh
    ollama pull llama3.2
    ollama pull nomic-embed-text
    ```

- Start SurrealDB:

  ```sh
  melos start_surreal
  ```

  This starts a local SurrealDB instance in a Docker container, with data persisted in ~/surrealdb_data.

- Configure Environment Variables:
  - Copy the example environment file: cp .env.example .env.dev
  - The default values in .env.dev are pre-configured for this local setup. No changes are needed.

- Run the App:

  ```sh
  melos run
  ```
Create .env.dev, .env.stg, and .env.prod files for each environment. Start by copying .env.example and filling in your details.
```sh
# .env.example

# Analytics (Optional)
MIXPANEL_PROJECT_TOKEN=your_mixpanel_token

# RAG Services URLs (Default values are for the local Ollama setup)
SPLIT_API_URL=http://localhost:8000/split
EMBEDDINGS_API_URL=http://localhost:11434/v1/embeddings
GENERATION_API_URL=http://localhost:11434/v1/chat/completions
```
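How these values are loaded at runtime is not shown in this file. As one common Flutter approach (an assumption for illustration, not necessarily the mechanism RAG.WTF uses), the flutter_dotenv package can read the flavor-specific file at startup:

```dart
import 'package:flutter_dotenv/flutter_dotenv.dart';

/// Illustrative only: loads the dev flavor's variables before runApp().
/// Note: flutter_dotenv requires the file to be declared as a Flutter asset.
Future<void> loadDevConfig() async {
  await dotenv.load(fileName: '.env.dev');

  // Read with fallbacks so missing optional keys don't crash the app.
  final splitApiUrl =
      dotenv.env['SPLIT_API_URL'] ?? 'http://localhost:8000/split';
  final embeddingsApiUrl = dotenv.env['EMBEDDINGS_API_URL'] ??
      'http://localhost:11434/v1/embeddings';

  print('Split service: $splitApiUrl');
  print('Embeddings service: $embeddingsApiUrl');
}
```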
The application's settings UI is dynamically populated from packages/settings/assets/json/llm_providers.json. You can add, remove, or modify LLM providers and their default models in this file.
```jsonc
// Example from packages/settings/assets/json/llm_providers.json
{
  "id": "ollama",
  "name": "Ollama",
  "base_url": "http://localhost:11434/v1",
  "website": "https://ollama.com",
  "embeddings": {
    "model": "nomic-embed-text",
    // ... other embedding models
  },
  "chat_completions": {
    "model": "llama3.2",
    // ... other chat models
  }
}
```
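As a hedged sketch of how such a file can drive the settings UI, the snippet below loads the asset with Flutter's rootBundle and maps it to a small model class. It assumes the file is a JSON array of provider objects; the actual loader in packages/settings may look different.

```dart
import 'dart:convert';

import 'package:flutter/services.dart' show rootBundle;

/// A trimmed-down provider model for illustration.
class LlmProvider {
  const LlmProvider({
    required this.id,
    required this.name,
    required this.baseUrl,
  });

  factory LlmProvider.fromJson(Map<String, dynamic> json) => LlmProvider(
        id: json['id'] as String,
        name: json['name'] as String,
        baseUrl: json['base_url'] as String,
      );

  final String id;
  final String name;
  final String baseUrl;
}

/// Loads the providers asset shipped by the settings package.
/// Assets from a package are addressed with the `packages/<name>/` prefix.
Future<List<LlmProvider>> loadProviders() async {
  final raw = await rootBundle
      .loadString('packages/settings/assets/json/llm_providers.json');
  final decoded = jsonDecode(raw) as List<dynamic>;
  return decoded
      .map((e) => LlmProvider.fromJson(e as Map<String, dynamic>))
      .toList();
}
```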
Text Splitting Service: The application relies on an external API endpoint for text splitting, defined by SPLIT_API_URL. The source code for this service is available at github.com/rag-wtf/split. You must run this service locally or deploy it for document processing to work.
The project uses Firebase for environment separation. To configure your own Firebase projects, run the configuration script for each environment:
```sh
./flutterfire-config.sh <environment>
```

Replace <environment> with dev, stg, or prod.
- SurrealDB: For production deployments, ensure your SurrealDB instance is secured with strong root credentials and appropriate network rules to restrict access.
- API Keys: Never hardcode API keys. Use environment variables (.env files) or a secure secrets management solution.
- Input Sanitization: Be mindful of prompt injection risks. While this is a demo application, production systems should implement robust input sanitization and validation.
Key scripts are defined in melos.yaml and can be run with melos <script_name>:

| Script | Description |
|---|---|
| run | Runs the app in the development flavor, using configuration from .env.dev. |
| bootstrap | Installs all dependencies across the monorepo. |
| generate / generate_packages | Runs build_runner for code generation across the project. |
| test | Runs all unit and widget tests for all packages. |
| local_integration_test | Runs integration tests locally using Docker and Chrome. |
| start_surreal | Starts a local SurrealDB instance via Docker. |
| build_prod | Builds the web app for production. |
The project includes a full suite of tests. Run all unit and widget tests with:

```sh
melos test
```

For end-to-end testing, run the local integration tests:

```sh
melos local_integration_test
```
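For orientation, the kind of widget test that melos test executes in each package looks roughly like the following generic example (illustrative only, not a test taken from the repository):

```dart
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

void main() {
  testWidgets('renders a greeting', (tester) async {
    // Pump a minimal widget tree; the real tests pump package views
    // with their Stacked view models and mocked services.
    await tester.pumpWidget(
      const MaterialApp(
        home: Scaffold(body: Text('Hello, RAG.WTF!')),
      ),
    );

    expect(find.text('Hello, RAG.WTF!'), findsOneWidget);
  });
}
```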
The CI/CD pipeline is managed by GitHub Actions:
- main.yaml: Validates pull requests with spell checks, analysis, and build tests.
- deploy.yaml: Publishes changes from the main branch to the develop, staging, and production branches. The staging branch is automatically deployed to Netlify, while the production branch is deployed to Coolify via a manual trigger. Before any deployment, the workflow runs the full integration test suite to ensure application stability and prevent regressions in the production release.
For a self-hosted production deployment, you can deploy the production web build (melos build_prod) to any static host and run a production instance of SurrealDB using Docker on your server. Ensure all environment variables in your .env.prod file point to your production services.
- Melos command fails: Ensure you have activated Melos globally (dart pub global activate melos) and that your Dart SDK's bin directory is in your system's PATH.
- App fails to connect to SurrealDB: Make sure the Docker container is running (docker ps) and that there are no port conflicts on your machine.
- Errors during code generation: Try cleaning the project with melos clean and then re-running melos bootstrap and melos generate.
- CORS errors in browser: When connecting the web app to a local service like Ollama, you may encounter CORS errors. You need to configure Ollama to accept requests from the web app's origin. Refer to the Ollama documentation for instructions on setting the OLLAMA_ORIGINS environment variable.
- Support for additional LLM providers.
- Advanced document processing features (e.g., image and table extraction).
- Enhanced UI/UX for a more intuitive user experience.
- Caching mechanisms to improve performance and reduce API costs.
Have you built something cool with RAG.WTF? We'd love to see it! Please submit a pull request to add your project to a community showcase list here.
Contributions are welcome! We use a fork-and-pull-request workflow. Please see the CONTRIBUTING.md file for detailed instructions on how to set up your development environment and contribute to the project.
We use GitHub Issue Templates for bug reports and feature requests. Please use them to provide clear and actionable information.
This project is licensed under the MIT License. See the LICENSE file for details.