
llama-chat 🦙

Your lightweight, private, local AI chatbot powered by llama.cpp (no GPU required)

A modern web interface for llama.cpp with markdown rendering, syntax highlighting, and intelligent conversation management. Chat with local LLMs through a sleek, GitHub-inspired interface.


✨ Features

  • 🤖 llama.cpp Integration - Direct integration with llama.cpp server for optimal performance
  • 🔄 Dynamic Model Switching - Switch between models without restarting services
  • 💬 Multiple Conversations - Create, manage, and rename chat sessions
  • 📚 Persistent History - SQLite database storage with search functionality
  • 🚀 Lightweight - Minimal resource usage, runs on CPU-only systems
  • 📝 Full Markdown Rendering - GitHub-flavored syntax with code highlighting
  • ⚡ Performance Metrics - Real-time response times, token tracking, and speed analytics
  • 🏥 Health Monitoring - Automatic service monitoring and restart capabilities

🚀 Quick Start

Prerequisites

⚠️ Before installing llama-chat, you need to have llama.cpp installed on your system ⚠️

Install llama.cpp:

```bash
# Option 1: Build via llama_cpp_setup.sh (recommended)
curl -fsSL https://github.com/ukkit/llama-chat/raw/main/llama_cpp_setup.sh | bash
```

Other installation options:

```bash
# Option 2: Build from source
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release

# Option 3: Install via package manager (if available)
# Ubuntu/Debian:
# apt install llama.cpp

# macOS:
# brew install llama.cpp
```

⚠️ Make sure llama-server is in your PATH ⚠️

```bash
which llama-server  # Should show the path to llama-server
```
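
If the binary is found, you can also confirm it actually runs; recent llama.cpp builds print build information with `--version` (flag availability may vary with the build you installed):

```bash
llama-server --version  # prints version/build info on recent llama.cpp builds
```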

30-Second Quick Start

For most users (auto-install):

```bash
curl -fsSL https://github.com/ukkit/llama-chat/raw/main/install.sh | bash
```

What the install script does:

  • ✅ Sets up Python virtual environment
  • ✅ Downloads recommended model (~400MB)
  • ✅ Installs llama-chat with Flask frontend
  • ✅ Creates configuration files
  • ✅ Starts both llama.cpp server and web interface

Access at: http://localhost:3333
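
As a quick smoke test from the terminal, you can check that both services answer. This sketch assumes the default web port of 3333; the llama.cpp server's port is configured in cm.conf, and 8080 below is only llama.cpp's usual default, not something this README guarantees:

```bash
# Web UI should return HTTP 200 on the default port
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3333

# llama.cpp server exposes a /health endpoint; adjust the port to match your cm.conf
curl -s http://localhost:8080/health
```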

🔧 Manual Installation

For detailed manual installation steps:

```bash
# Prerequisites: Python 3.8+, llama.cpp installed, and at least one .gguf model
git clone https://github.com/ukkit/llama-chat.git
cd llama-chat
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Download a model (optional - you can add your own)
./chat-manager.sh download-model \
  "https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/qwen2.5-0.5b-instruct-q4_0.gguf" \
  "qwen2.5-0.5b-instruct-q4_0.gguf"

# Start services
./chat-manager.sh start
```

📸 Screenshots

📷 App Screenshots

  • Main interface
  • Chat interface
  • Select models from dropdown
  • Model switch
  • Switch model by selecting an existing chat
  • Model switching complete
  • Full Markdown rendering

Configuration Files

| File | Purpose |
| --- | --- |
| `cm.conf` | Main chat-manager configuration (ports, performance, model settings) |
| `config.json` | Model parameters, timeouts, system prompt |
| `docs/detailed_cm.conf` | Config file with more configuration options for llama-chat and the llama.cpp server |

See docs/config.md for complete configuration options.
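
For orientation, here is a rough sketch of cm.conf. Since it is read by the bash chat-manager.sh, a shell-style `KEY=value` layout is assumed; only `GPU_LAYERS` and port 3333 appear elsewhere in this README, and `FLASK_PORT` is a hypothetical name, so consult docs/config.md for the real keys:

```bash
# Hypothetical cm.conf sketch -- actual key names are documented in docs/config.md
FLASK_PORT=3333   # web interface port used in the quick start (name is illustrative)
GPU_LAYERS=0      # 0 = CPU-only; see "Slow responses" in the troubleshooting table
```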

🔧 Enhanced Management Commands

llama-chat includes a comprehensive management script with enhanced features:

Core Operations

```bash
# Basic operations
./chat-manager.sh start              # Start all services (llama.cpp + Flask + monitor)
./chat-manager.sh stop               # Stop all services
./chat-manager.sh restart            # Restart all services
./chat-manager.sh status             # Show detailed service status and health
```

See docs/chat-manager.md for detailed operations.
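
Beyond the core operations, these subcommands appear elsewhere in this README and are handled by the same script:

```bash
./chat-manager.sh download-model <url> <file>   # fetch a .gguf model
./chat-manager.sh switch-model <file>           # hot-swap the active model
./chat-manager.sh list-models                   # list available models
./chat-manager.sh force-cleanup                 # clear stuck processes and ports
./chat-manager.sh setup-venv                    # rebuild the Python virtual environment
./chat-manager.sh test                          # run a health check
```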

🤖 Supported Models

llama-chat works with any .gguf format model. Here are some popular options:

Recommended Starter Models

```bash
# Fast, lightweight (400MB) - Great for testing
./chat-manager.sh download-model \
  "https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/qwen2.5-0.5b-instruct-q4_0.gguf" \
  "qwen2.5-0.5b-instruct-q4_0.gguf"

# Compact, good performance (1.3GB)
./chat-manager.sh download-model \
  "https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q4_K_M.gguf" \
  "llama3.2-1b-instruct-q4.gguf"
```

Model Categories

  • Ultra-fast: tinyllama, qwen2.5:0.5b (good for testing)
  • Balanced: phi3-mini, llama3.2:1b (daily use)
  • High-quality: llama3.1:8b, qwen2.5:7b (when you have RAM)
  • Specialized: codellama, mistral-nemo (coding, specific tasks)

Dynamic Model Switching

Switch between models without restarting services:

```bash
# Switch to a different model
./chat-manager.sh switch-model phi3-mini-4k-instruct-q4.gguf

# Check current model
./chat-manager.sh status

# List available models
./chat-manager.sh list-models
```

🔧 Need Help?

| Issue | Solution |
| --- | --- |
| llama.cpp not found | Install llama.cpp and ensure `llama-server` is in PATH |
| Port in use | `./chat-manager.sh force-cleanup` |
| No models | `./chat-manager.sh download-model <url> <file>` |
| Process stuck | `./chat-manager.sh force-cleanup` |
| Slow responses | Use a smaller model or adjust `GPU_LAYERS` |
| Memory issues | Reduce context size in `cm.conf` |
| Model switching fails | Check the model file exists: `./chat-manager.sh list-models` |
| Services won't start | Check health: `./chat-manager.sh test` |
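
For the port and stuck-process rows above, a typical recovery sequence using only commands from this README is:

```bash
./chat-manager.sh force-cleanup   # free stuck ports and processes
./chat-manager.sh start           # bring services back up
./chat-manager.sh status          # confirm both services report healthy
```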

Common Installation Issues

| Problem | Cause | Solution |
| --- | --- | --- |
| `llama-server` not found | llama.cpp not installed | Install llama.cpp from source or package manager |
| Permission denied | Executable permissions missing | `chmod +x chat-manager.sh` |
| Port conflicts | Services already running | `./chat-manager.sh force-cleanup` |
| Python module errors | Virtual environment issues | Re-run setup: `./chat-manager.sh setup-venv` |
| Model loading fails | Corrupted or wrong format | Re-download the model |

See docs/troubleshooting.md for comprehensive troubleshooting.

βœ”οΈ Tested Platforms

| Platform | CPU | RAM | llama.cpp | Status | Notes |
| --- | --- | --- | --- | --- | --- |
| Ubuntu 20.04+ | x86_64 | 8GB+ | Source/Package | ✅ Excellent | Primary development platform |
| Windows 11 | x86_64 | 8GB+ | WSL2/Source | ✅ Good | WSL2 recommended |
| Debian 12+ | x86_64 | 8GB+ | Source/Package | ✅ Excellent | Server deployments |

📚 Documentation

| Document | Description |
| --- | --- |
| Installation Guide | Complete installation instructions |
| Configuration Guide | Detailed configuration options |
| API Documentation | REST API reference with examples |
| Troubleshooting | Common issues and solutions |
| Management Script | chat-manager.sh documentation |
| Models | Model recommendations and setup |

πŸ™ Acknowledgments

Made with ❤️ for the AI community

⭐ Star this project if you find it helpful!


MIT License - see LICENSE file.