LocalRAG is a terminal-based LLM chat tool with infinite memory through local vector search. It turns your terminal into a ChatGPT/Claude/Gemini-style chat interface with persistent, searchable conversation memory.
- ✨ Interactive chat with leading OpenAI, Anthropic, Gemini, xAI, and local Ollama models
- 🧠 Infinite chat memory via local FAISS vectorstore (Retrieval-Augmented Generation)
- 📂 Save and continue favorite chats at any time (across all models)
- 🏷️ Automatic, smart conversation titling for easy recall
- 🔄 Switch models live (via config or CLI), across both local and proprietary providers
- 🖼️ Send images directly in chat with `\image <path>` (for vision-capable models)
- 🔑 New, unified config: supports OpenAI/Anthropic/Gemini/xAI/Ollama in one flow
- 🍃 Local LLM support via Ollama: Run Llama, Gemma, DeepSeek, Phi, and more on your machine!
- 📜 Expanded model list (`localrag models`) returns both proprietary and local models with context window sizes
- ⭐ Update checker: easily update to the latest version via the CLI
- 🚫 100% local & privacy-respecting. Your chat memory never leaves your device.
```bash
pip install git+https://github.com/immanuel-peter/localrag.git
```
Requires Python 3.8 or higher.
Optional, for local LLMs: install Ollama. It's recommended to pull the models you want to use before starting LocalRAG for a smoother experience.
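For example, `ollama pull llama3.3` downloads Meta's Llama 3.3 to your machine ahead of time (the model name is just an example; pull whichever models you plan to chat with).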
```bash
localrag config
```
You'll be prompted to set OpenAI, Anthropic, Google Gemini, and xAI API keys, configure Ollama if installed, and choose your default model.
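Your keys and default model are saved locally to `~/.localrag/config.json` (see the data layout below); re-run `localrag config` any time you need to change them.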
```bash
localrag run gpt-4.1
localrag run claude-3.7
localrag run gemini-2.5-pro
localrag run llama-4-scout
```
Use `localrag models` to see all valid aliases!
| Command | Action |
|---|---|
| `\save` | Save chat as favorite |
| `\clear` | Clear current chat |
| `\switch <model>` | Switch LLM/model if no messages sent |
| `\image <path>` | Attach image to next user message (vision LLMs) |
| `\quit` | Exit LocalRAG |
| `\help` | Show available commands |
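For example, `\image ~/Pictures/chart.png` attaches that image to your next message (with a vision-capable model selected), and `\save` afterward keeps the conversation in your favorites; the path here is just an illustration.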
```bash
localrag saved        # list your saved (favorite) chats
localrag saved -c 2   # continue a saved chat (here, chat #2)
localrag update       # check for and install the latest version
```
Everything is local:
```
~/.localrag/
├── chats/               # Individual chat JSON files
├── vector_store.faiss   # FAISS index (chat context memory)
├── vector_store.json    # Metadata (chat IDs)
└── config.json          # API keys and default model
```
Your chat memory never leaves your device.
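Since everything is plain files, you can back up your memory by copying this directory, or reset it entirely by deleting it.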
- Each message (user and assistant) is embedded via sentence-transformers into a FAISS vector DB
- Every new user message is contextually enriched by searching all past chats for relevant history (see the sketch after this list)
- Context is added to your model prompt (no cloud API sees your full memory)
- Smarter, more personalized, and more contextual conversations across models and providers
- Use both local and proprietary LLMs in the same CLI
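Conceptually, the retrieval step looks roughly like the sketch below. This is a simplified illustration rather than LocalRAG's actual code; the embedding model and variable names are assumptions.

```python
# Simplified sketch of the retrieval idea (not LocalRAG's actual implementation).
# Assumes: pip install sentence-transformers faiss-cpu
import faiss
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # embedding model name is an assumption

# 1. Embed past messages (user and assistant) and store them in a FAISS index.
history = [
    "User: I'm building a CLI tool in Python with Typer.",
    "Assistant: Great, Typer makes subcommands and options easy to define.",
]
vectors = embedder.encode(history, normalize_embeddings=True)
index = faiss.IndexFlatIP(int(vectors.shape[1]))  # inner product = cosine on normalized vectors
index.add(vectors)

# 2. For every new user message, search the index for relevant history.
new_message = "How should I structure the config command?"
query = embedder.encode([new_message], normalize_embeddings=True)
_, hits = index.search(query, 2)
context = "\n".join(history[i] for i in hits[0])

# 3. Prepend the retrieved context to the prompt sent to the chosen model.
prompt = f"Relevant past conversation:\n{context}\n\nUser: {new_message}"
print(prompt)
```

In the real tool, the index and its metadata live in `~/.localrag/vector_store.faiss` and `vector_store.json`, and the retrieved snippets are folded into the prompt sent to whichever model you're chatting with.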
See the full live list with `localrag models`. Examples of currently supported models:
Proprietary:
- `gpt-4.1`, `gpt-4o-mini`, `o4-mini`, `o3` (OpenAI)
- `claude-3.7`, `claude-3.5` (Anthropic)
- `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.0` (Google)
- `grok-3` (xAI)

Local/Ollama:
- `llama-4-scout`, `llama-4-maverick`, `llama-3.3` (Meta)
- `gemma3` (Google)
- `deepseek-r1` (DeepSeek)
- `phi-4-mini` (Microsoft)
- ...and more!
Contributions are very welcome! 🚀
Want to:
- Add support for new LLMs/providers/local models?
- Improve vector search/RAG logic?
- Add slash commands, CLI features, or file support?
- Enhance performance or UX?
Please fork, branch, and submit a pull request with your improvements. Keep PRs focused and modular!
MIT License. Made by Immanuel Peter.
- Chat with files
- Session-based summaries
- Custom/plug-in RAG pipelines
- Support for custom models
Stay tuned!
Breaking change: If you have used previous versions, please re-run `localrag config` to refresh your keys and set up new provider options!