Highly configurable API with multiple LLM inference backends and persistent storage.
- Start-up
- Generation Playground
- Assistant Playground
- Semantic Search Playground
- Tuning Playground
- Architecture
## Start-up

```bash
# Install & activate
uv sync && source .venv/bin/activate

# Download a model
python scripts/download_model.py transformers google/gemma-3-270m --output-dir ./models/gemma-3-270m

# Start the backend
python main.py --config config.yaml

# Start the frontend
cd frontend
npm install
npm run dev
```

- Backend API: http://localhost:8000 | Docs: http://localhost:8000/docs
- Frontend UI: http://localhost:5173
Development (`config.development.yaml`):

```yaml
database:
  url: "sqlite+aiosqlite:///./data/conversations.db"
  echo: true  # SQL query logging
logging:
  level: "DEBUG"
  enable_console: true
  enable_file: true
  json_format: false
```
Production (`config.production.yaml`):

```yaml
database:
  url: "${DATABASE_URL}"
  echo: false
  pool_size: 20
logging:
  level: "INFO"
  enable_console: false
  enable_file: true
  json_format: true
```
| Backend | Best For | Features |
|---|---|---|
| Transformers | Research, wide compatibility | Quantization, GPU acceleration |
| llama.cpp | CPU inference, low memory | GGUF format, hybrid CPU/GPU |
| vLLM | Production, high throughput | PagedAttention, tensor parallel |
| Ollama | Easy setup, model management | Built-in downloads, streaming |
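For a feel of what two of these rows mean in code, here is how the underlying libraries are typically driven directly; this is a sketch of the libraries' own APIs, not this project's wrapper classes, and the GGUF path is a placeholder.

```python
# Library-level sketch: Transformers vs. llama.cpp (via llama-cpp-python).
from transformers import pipeline

# Transformers: broad model support, quantization, GPU acceleration.
pipe = pipeline("text-generation", model="./models/gemma-3-270m")
print(pipe("Hello", max_new_tokens=32)[0]["generated_text"])

# llama.cpp: GGUF files on CPU, with optional hybrid offload of layers to GPU.
from llama_cpp import Llama

llm = Llama(model_path="./models/model.gguf", n_gpu_layers=20)  # placeholder path
print(llm("Hello", max_tokens=32)["choices"][0]["text"])
```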
## Generation Playground

- React UI (TypeScript, Vite, Tailwind CSS)
- Text generation (text-completion interface)
- Continue and expand writing with AI assistance
```bash
# Text generation
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello", "parameters": {"temperature": 0.7}}'

# Stateless chat
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hi"}]}'
```
## Assistant Playground

- Real-time conversations with message history
- Session-based conversations (persistent SQL database storage)
- Conversation management (list, view, and delete conversation sessions)
```bash
# Start a conversation
SESSION_ID=$(curl -s -X POST http://localhost:8000/conversations/start \
  -H "Content-Type: application/json" \
  -d '{"system_prompt": "You are helpful"}' | jq -r '.session_id')

# Send a message
curl -X POST http://localhost:8000/conversations/$SESSION_ID/message \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!", "parameters": {"max_tokens": 100}}'

# Get the history
curl http://localhost:8000/conversations/$SESSION_ID
```
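The same flow from Python, using the `requests` library; the endpoints and field names mirror the curl calls above, while the exact response shapes are assumptions.

```python
# Conversation flow against the endpoints shown above (sketch, not an official client).
import requests

BASE = "http://localhost:8000"

# Start a session and keep its id; history is stored server-side.
session_id = requests.post(
    f"{BASE}/conversations/start",
    json={"system_prompt": "You are helpful"},
).json()["session_id"]

# Send a message within the session.
reply = requests.post(
    f"{BASE}/conversations/{session_id}/message",
    json={"message": "Hello!", "parameters": {"max_tokens": 100}},
)
print(reply.json())

# Fetch the stored history.
print(requests.get(f"{BASE}/conversations/{session_id}").json())
```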
- Multiple Backends (Transformers, llama.cpp, vLLM, Ollama)
- Dual Modes (text completion + assistant chat)
- Logging (File rotation, JSON format, configurable levels)
- Plug & Play (Config-driven model/backend switching)
## Semantic Search Playground

- RAG (Retrieval-Augmented Generation) with semantic search over ChromaDB
- Document processing (PDF, Word, TXT file upload and chunking)
- Semantic search over documents and conversations
```bash
# Upload a document
curl -X POST http://localhost:8000/rag/upload \
  -F "file=@document.pdf"

# Search documents and conversations
curl -X POST http://localhost:8000/rag/search \
  -H "Content-Type: application/json" \
  -d '{"query": "machine learning concepts", "limit": 10}'

# List uploaded documents
curl http://localhost:8000/rag/documents
```
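Conceptually, upload and search map onto a ChromaDB collection as sketched below; the collection name and chunking parameters are illustrative assumptions, not this project's actual values.

```python
# Sketch of the vector-store side of /rag/upload and /rag/search using ChromaDB.
import chromadb

client = chromadb.PersistentClient(path="./data/chroma")  # vector store under ./data
collection = client.get_or_create_collection("documents")

def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split extracted text into overlapping chunks for indexing."""
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

# Upload path: extract text from the PDF/Word/TXT file, chunk it, index it.
text = "..."  # placeholder for text extracted from an uploaded file
chunks = chunk(text)
collection.add(
    documents=chunks,
    ids=[f"document.pdf-{i}" for i in range(len(chunks))],
    metadatas=[{"filename": "document.pdf", "chunk_index": i} for i in range(len(chunks))],
)

# Search path: embed the query and return the nearest chunks.
results = collection.query(query_texts=["machine learning concepts"], n_results=10)
print(results["documents"][0])
```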
## Tuning Playground

Frontend:
- Dataset uploader
- Training config selector (LoRA vs. QLoRA vs. full fine-tuning)
- Progress dashboard

Backend:
- Job queue (FastAPI background tasks / Celery)
- Training pipeline (Hugging Face transformers, peft, accelerate)
- Model registry with versioning

Serving:
- Dynamically load fine-tuned models
- Let the user switch between base and tuned versions

A dedicated tab where users can:
- Upload a few examples
- Choose a tuning method
- Preview model behavior before and after tuning

In effect, this gives the user a sandbox to test model behavior; see the sketch below.
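A minimal fine-tuning and serving sketch with Hugging Face transformers and peft, assuming the quick-start Gemma model; the adapter path and target modules are illustrative, and the training loop itself is elided.

```python
# LoRA fine-tuning and serving sketch (transformers + peft); not this
# project's pipeline, just the library calls it would build on.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, PeftModel, get_peft_model

BASE = "./models/gemma-3-270m"              # model from the quick-start step
ADAPTER_DIR = "./models/gemma-3-270m-lora"  # hypothetical adapter output path

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

# Wrap the base model so only small low-rank adapter matrices are trainable.
lora = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections; model-dependent
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of all weights

# ... train with a standard transformers Trainer / accelerate loop ...
model.save_pretrained(ADAPTER_DIR)  # persists only the adapter weights

# Serving: reload the base model and attach the tuned adapter on demand,
# which is what makes switching between base and tuned versions cheap.
base = AutoModelForCausalLM.from_pretrained(BASE)
tuned = PeftModel.from_pretrained(base, ADAPTER_DIR)
```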
## Architecture

```
├── llm_playground/
│   ├── api/          # FastAPI endpoints
│   ├── backends/     # Transformers, llama.cpp, vLLM, Ollama
│   ├── core/         # Services, SQL conversation management, RAG
│   ├── models/       # Pydantic schemas, SQLAlchemy database models
│   └── config/       # Configuration management
├── scripts/          # Download models, run modes
├── data/             # SQLite database, ChromaDB vector store
└── logs/             # Application & access logs
```

```
frontend/
├── src/
│   ├── components/        # React components
│   │   ├── chat/          # Chat interface components
│   │   ├── conversations/ # Conversation management
│   │   ├── generation/    # Text generation interface
│   │   ├── layout/        # Layout components (Sidebar)
│   │   ├── rag/           # RAG/Knowledge base components
│   │   ├── settings/      # Settings and system info
│   │   └── ui/            # Reusable UI components
│   ├── lib/               # API client and utilities
│   ├── types/             # TypeScript type definitions
│   └── App.tsx            # Main application component
├── public/                # Static assets
└── package.json           # Dependencies and scripts
```
Storage defaults to SQLite:

```yaml
# config.yaml
database:
  url: "sqlite+aiosqlite:///./data/conversations.db"
  echo: false
```

To switch to PostgreSQL, point `DATABASE_URL` at the server:

```bash
export DATABASE_URL="postgresql+asyncpg://user:password@host:5432/db"
python main.py
```

```yaml
# config.production.yaml
database:
  url: "${DATABASE_URL}"
  pool_size: 20
  max_overflow: 30
```
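The `${DATABASE_URL}` placeholder has to be resolved when the config is loaded; here is a minimal sketch with PyYAML and `os.path.expandvars` (this project's actual loader may differ):

```python
# Expand ${VAR} references from the environment before parsing the YAML.
import os
import yaml

def load_config(path: str) -> dict:
    with open(path) as f:
        return yaml.safe_load(os.path.expandvars(f.read()))

cfg = load_config("config.production.yaml")
print(cfg["database"]["url"])  # postgresql+asyncpg://user:password@host:5432/db
```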
SQLite (default):
- Location: `./data/conversations.db` (project directory)
- Access: `sqlite3 ./data/conversations.db`
- Benefits: zero setup, portable, version-controllable

PostgreSQL:
- Location: external database server or Docker container
- Access: standard PostgreSQL tools (`psql`, pgAdmin)
- Benefits: concurrent users, ACID transactions, full-text search
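Either URL plugs into the same SQLAlchemy async engine; a minimal sketch (the project's own session handling is not shown):

```python
# Create the async engine from the configured database URL.
from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

engine = create_async_engine(
    "sqlite+aiosqlite:///./data/conversations.db",  # or the postgresql+asyncpg URL,
    echo=False,                                     # with pool_size/max_overflow set
)
session_factory = async_sessionmaker(engine, expire_on_commit=False)
```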
```sql
-- Conversations table
CREATE TABLE conversations (
    session_id VARCHAR(36) PRIMARY KEY,
    created_at TIMESTAMP NOT NULL,
    last_updated TIMESTAMP NOT NULL,
    system_prompt TEXT,
    model_name VARCHAR(255),
    backend VARCHAR(50),
    last_parameters JSON,
    message_count INTEGER NOT NULL DEFAULT 0
);

-- Messages table
CREATE TABLE messages (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(36) REFERENCES conversations(session_id),
    role VARCHAR(20) NOT NULL,
    content TEXT NOT NULL,
    timestamp TIMESTAMP NOT NULL,
    sequence_number INTEGER NOT NULL
);

-- RAG Documents table
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    filename VARCHAR(255) NOT NULL,
    content_hash VARCHAR(64) NOT NULL UNIQUE,
    file_size INTEGER NOT NULL,
    document_metadata JSON,
    created_at TIMESTAMP NOT NULL,
    updated_at TIMESTAMP NOT NULL
);

-- RAG Document Chunks table
CREATE TABLE document_chunks (
    id SERIAL PRIMARY KEY,
    document_id INTEGER REFERENCES documents(id),
    chunk_index INTEGER NOT NULL,
    content TEXT NOT NULL,
    chunk_metadata JSON,
    created_at TIMESTAMP NOT NULL
);

-- RAG Conversation Chunks table (for searchable chat history)
CREATE TABLE conversation_chunks (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(36) REFERENCES conversations(session_id),
    chunk_index INTEGER NOT NULL,
    content TEXT NOT NULL,
    chunk_metadata JSON,
    created_at TIMESTAMP NOT NULL
);
```
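With the default SQLite store, this schema can be queried directly; for example, replaying a session's history with Python's built-in `sqlite3` module (the session id is a placeholder):

```python
# Replay a conversation's messages in order, straight from the database.
import sqlite3

conn = sqlite3.connect("./data/conversations.db")
rows = conn.execute(
    """
    SELECT role, content FROM messages
    WHERE session_id = ?
    ORDER BY sequence_number
    """,
    ("your-session-id",),  # placeholder: an id returned by /conversations/start
).fetchall()
for role, content in rows:
    print(f"{role}: {content}")
conn.close()
```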
Deployment options:

```bash
# Use docker-compose for the full stack
docker-compose up -d

# Use cloud PostgreSQL (AWS RDS, Google Cloud SQL, etc.)
export DATABASE_URL="postgresql+asyncpg://user:pass@db-host:5432/llm_playground"
python main.py --config config.production.yaml

# Single-server deployment (default SQLite)
python main.py --config config.yaml
```