# Mistral-Extension

A lightweight Chrome extension that lets you highlight text on any webpage, send it to a locally running Mistral model (via Ollama), and view the response as a tooltip — all without the cloud. Fast, private, and offline.
This project assumes you're using Windows, macOS, or Linux with basic Python and Git knowledge.
## Prerequisites

- Ollama (to run Mistral locally)
- Python 3.8+ (to run the local API server)
- Chrome (or Edge, Brave, etc.)
- Git
## Install Ollama

Visit https://ollama.com/download, or use the install script on macOS/Linux:

```shell
curl -fsSL https://ollama.com/install.sh | sh
```

On Windows, run the `.exe` installer.
## Run the Mistral model

Open a terminal and run:

```shell
ollama run mistral
```

This will:

- Download the Mistral 7B model (~4 GB)
- Launch an interactive local chat
- Start a local inference server at http://localhost:11434

Leave this terminal running.
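You can sanity-check the local server from Python before wiring up the extension. This is a minimal sketch: the `/api/generate` route and payload fields follow Ollama's public REST API, and the `ask_ollama` helper name is purely illustrative.

```python
# Query the local Ollama server (assumes `ollama run mistral` is
# active on the default port 11434, as set up in the step above).
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt, model="mistral"):
    """Build a non-streaming generate request in Ollama's documented format."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_ollama(prompt):
    """Send a prompt and return the model's text response."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]

# Example (requires the Ollama server to be running):
# print(ask_ollama("Summarize: local LLMs keep data on-device."))
```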
## Set up the local API server

```shell
git clone https://github.com/kcho2027/Mistral-Extension.git
cd Mistral-Extension
pip install flask requests
python mistral_server.py
```

This exposes a helper API at http://localhost:11435/ask. Leave this terminal running too.
## Load the extension in Chrome

1. Go to `chrome://extensions` in your browser.
2. Enable **Developer mode** (toggle in the top-right corner).
3. Click **Load unpacked** and select the `Mistral-Extension` folder containing:
   - `manifest.json`
   - `background.js`
   - `icon.png`

You should now see the extension installed.
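The repository's actual `manifest.json` is not reproduced here, but a minimal Manifest V3 config consistent with the files listed above could look like this (names, version, and permission list are assumptions):

```json
{
  "manifest_version": 3,
  "name": "Mistral-Extension",
  "version": "1.0",
  "permissions": ["contextMenus", "activeTab", "scripting"],
  "host_permissions": ["http://localhost:11435/*"],
  "background": { "service_worker": "background.js" },
  "icons": { "128": "icon.png" }
}
```

The `host_permissions` entry lets the extension call the local helper API on port 11435 without cross-origin errors.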
## Usage

- Open any webpage.
- Highlight some text.
- Right-click → select "Ask Mistral about this text".
- The tooltip will appear in the top-right corner with Mistral's response.
🖱️ The response box supports scrolling and auto-disappears after 15 seconds.
## Project structure

```
Mistral-Extension/
├── manifest.json        # Chrome extension config
├── background.js        # Extension logic & tooltip injection
├── icon.png             # Extension icon
├── mistral_server.py    # Flask API server that connects to Ollama
└── README.md            # You're reading this
```
## Features

- ✅ Completely local and offline
- ✅ Uses the powerful Mistral 7B model
- ✅ Runs via Ollama
- ✅ No external API keys or OpenAI account needed
- ✅ Auto-injected scrollable response box
- ✅ Supports any Chromium browser
## Privacy

All your text, prompts, and model responses stay on your device — nothing is sent to any cloud service. Perfect for private research and local AI tooling.
## Future ideas

- Show response near cursor instead of top-right
- Keep chat history per session
- Add keyboard shortcut (e.g. Ctrl+M) to trigger prompt
- Add a dismiss (X) button to tooltip
- Automatically resize response box for long outputs
- Add multi-model support via Ollama (`mistral`, `llama2`, etc.)
Created by Kyoungbin Cho. Pull requests, ideas, and contributions welcome!

MIT License. Free for personal or commercial use.