An offline AI assistant that runs entirely in your browser using WebLLM and WebGPU!
This application allows you to chat with LLMs directly in your browser without sending data to external servers. All processing happens locally on your device.
- A modern browser with WebGPU support:
  - Chrome 113+
  - Edge 113+
  - Firefox 118+ (WebGPU may need to be enabled in `about:config`)
- A device with sufficient GPU capability
- Approximately 1-4 GB of free storage space, depending on the model size
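Before loading a model, the app can verify that the browser actually exposes WebGPU. A minimal sketch using the standard `navigator.gpu` API (the function name `checkWebGPU` is just for illustration):

```js
// Returns true only if the browser exposes WebGPU *and* a usable GPU adapter exists.
async function checkWebGPU() {
  if (!("gpu" in navigator)) return false; // API not available at all
  const adapter = await navigator.gpu.requestAdapter();
  return adapter !== null; // null means no suitable GPU was found
}

checkWebGPU().then((supported) => {
  if (!supported) console.warn("WebGPU is unavailable; the assistant cannot run here.");
});
```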
- Open the `index.html` file in a supported browser
- Select a model from the dropdown menu
- Click "Load Model" and wait for the download to complete (a sketch of the underlying call follows below)
- Start chatting with the AI!
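Under the hood, "Load Model" boils down to a single WebLLM call that downloads and compiles the selected model. A minimal sketch, assuming the `@mlc-ai/web-llm` package is available as an ES module import and that the model ID below matches one of WebLLM's prebuilt builds:

```js
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Example prebuilt model ID; the actual ID comes from the dropdown selection.
const modelId = "SmolLM2-360M-Instruct-q4f16_1-MLC";

// Fetches the quantized weights (cached by the browser) and compiles the model.
const engine = await CreateMLCEngine(modelId, {
  initProgressCallback: (report) => console.log(report.text),
});

// OpenAI-style chat completion, served entirely by the local model.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello! What can you do offline?" }],
});
console.log(reply.choices[0].message.content);
```

Run this from a `<script type="module">` tag so the top-level `await` works; the bare `@mlc-ai/web-llm` specifier needs a bundler or an import map.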
This app uses:
- The WebLLM library to run models in the browser
- WebGPU for hardware acceleration
- Quantized models to keep download size and memory use manageable
- Plain HTML, CSS, and JavaScript (no framework dependencies)
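Because WebLLM mirrors the OpenAI chat API, streaming responses token by token is straightforward. A sketch, assuming the `engine` from the loading example above and a `<div id="output">` element in the page:

```js
// Request a streamed completion; each chunk carries a small text delta.
const chunks = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Explain WebGPU in one paragraph." }],
  stream: true,
});

const output = document.getElementById("output");
for await (const chunk of chunks) {
  // Append each delta as it arrives, so text appears incrementally.
  output.textContent += chunk.choices[0]?.delta?.content ?? "";
}
```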
- SmolLM2 360M: A very small model, great for basic tasks
- Llama 3.1 8B: The largest of the three (8 billion parameters), with the strongest capabilities
- Phi 3.5 Mini: A mid-sized model (roughly 3.8 billion parameters) with good response quality
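Rather than hard-coding these three options, the dropdown could also be filled from WebLLM's prebuilt model registry. A sketch, assuming a `<select id="model-select">` element in the page (`prebuiltAppConfig` is exported by `@mlc-ai/web-llm`):

```js
import { prebuiltAppConfig } from "@mlc-ai/web-llm";

const select = document.getElementById("model-select");

// Each entry's model_id encodes the model family, size, and quantization,
// e.g. "Llama-3.1-8B-Instruct-q4f32_1-MLC".
for (const model of prebuiltAppConfig.model_list) {
  const option = document.createElement("option");
  option.value = model.model_id;
  option.textContent = model.model_id;
  select.appendChild(option);
}
```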
Feel free to modify the app to suit your needs. The entire application is contained in a single HTML file for simplicity.
This project is open source and available under the MIT License.
For a more feature-rich implementation with additional functionality, check out:
- WebLLM Offline AI Assistant - A more advanced version with:
  - PC-themed desktop interface
  - Chat history support
  - IndexedDB caching
  - Logger
  - Draggable windows
  - Taskbar and window controls
  - Responsive design for mobile and desktop

Live demo: chat.ebenezerdon.com