🤖 Stateful AI Chat

A simple Streamlit application that enables persistent conversations with local AI models through Ollama. Your AI remembers previous conversations across sessions using semantic memory.

✨ Features

  • 🧠 Persistent Memory: Conversations are stored and recalled across sessions
  • 🔍 Semantic Search: Retrieves relevant past conversations based on context
  • 🤖 Multi-Model Support: Works with any locally installed Ollama model
  • ⚙️ Configurable: Adjust memory depth, relevance thresholds, and context length
  • 💾 Local Storage: All data stays on your machine using ChromaDB

🚀 Quick Start

Prerequisites

  1. Install Ollama (if not already installed):

    # On macOS/Linux
    curl -fsSL https://ollama.ai/install.sh | sh
    
    # On Windows - download from https://ollama.ai
  2. Install at least one model:

    ollama pull gemma3:12b
    # or any other model you prefer
    ollama pull llama3.2:3b
  3. Start Ollama server:

    ollama serve

Installation

  1. Clone this repository:

    git clone https://github.com/mebrown47/stateful-ai-chat
    cd stateful-ai-chat
  2. Install Python dependencies:

    pip install -r requirements.txt
  3. Run the application:

    streamlit run stateful_ai_chat.py
  4. Open your browser to http://localhost:8501

🎯 Usage

Basic Chat

  1. Select your preferred model from the dropdown
  2. Start chatting - the AI will respond using the selected model
  3. Your conversation history is automatically saved (see the sketch below)
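
Under the hood, each turn is a round trip to Ollama's local HTTP API, with the exchange persisted for later recall. A minimal sketch of that flow, assuming the default endpoint and a non-streaming request; the storage path, collection name, and document format here are illustrative, not the app's exact code:

    import uuid
    import requests
    import chromadb

    OLLAMA_URL = "http://localhost:11434"  # Ollama's default local endpoint
    client = chromadb.PersistentClient(path="memory_llama3.2_3b")
    collection = client.get_or_create_collection("conversations")

    def chat_and_remember(model: str, user_message: str) -> str:
        """Send one turn to Ollama, then persist the exchange for later recall."""
        response = requests.post(
            f"{OLLAMA_URL}/api/chat",
            json={
                "model": model,
                "messages": [{"role": "user", "content": user_message}],
                "stream": False,  # single JSON reply instead of a token stream
            },
            timeout=120,
        )
        response.raise_for_status()
        reply = response.json()["message"]["content"]
        # Store the full exchange as one document so it can be recalled later.
        collection.add(
            documents=[f"User: {user_message}\nAssistant: {reply}"],
            ids=[str(uuid.uuid4())],
        )
        return reply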

Memory Features

  • Enable Memory: Toggle to use previous conversations as context
  • Memory Depth: Number of relevant past conversations to include (1-20)
  • Relevance Threshold: Minimum similarity score for memory retrieval (0.1-0.9)
  • Max Context Length: Maximum characters for memory context (300-2000)
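
Conceptually, the three settings above map onto a single vector-store query. A minimal sketch of how retrieval could look with ChromaDB; the collection name, path, and the distance-to-similarity conversion are assumptions, not the app's exact code:

    import chromadb

    client = chromadb.PersistentClient(path="memory_llama3.2_3b")
    collection = client.get_or_create_collection("conversations")

    def recall(query: str, depth: int = 5, threshold: float = 0.3,
               max_chars: int = 1000) -> str:
        """Fetch up to `depth` similar past exchanges, keep those above the
        relevance threshold, and trim the combined text to `max_chars`."""
        results = collection.query(query_texts=[query], n_results=depth)
        kept = []
        for doc, distance in zip(results["documents"][0], results["distances"][0]):
            if 1.0 - distance >= threshold:  # treat (1 - distance) as similarity
                kept.append(doc)
        return "\n".join(kept)[:max_chars]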

Model Management

  • The dropdown automatically shows all your installed Ollama models
  • Switch models anytime - each model has its own memory database
  • If the connection to Ollama fails, you can enter a model name manually (see the sketch below)
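
The dropdown is presumably populated from Ollama's /api/tags endpoint. A hedged sketch of that lookup with the manual fallback (the widget labels and default model name are illustrative):

    import requests
    import streamlit as st

    def installed_models() -> list[str]:
        """List locally installed Ollama models; empty list if unreachable."""
        try:
            tags = requests.get("http://localhost:11434/api/tags", timeout=5).json()
            return [m["name"] for m in tags.get("models", [])]
        except requests.RequestException:
            return []

    models = installed_models()
    if models:
        model = st.selectbox("Model", models)
    else:
        model = st.text_input("Model name", value="llama3.2:3b")  # manual fallback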

🛠️ Configuration

Memory Settings (Sidebar)

  • Memory Depth: How many relevant memories to retrieve (default: 5)
  • Relevance Threshold: How similar a past exchange must be to the current message before it is recalled (default: 0.3)
  • Context Length: Maximum amount of memory text to include in the prompt (default: 1000 characters)
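
In Streamlit, settings like these are typically plain sidebar widgets. A sketch using the ranges and defaults listed above; the labels are taken from this README, and the app's actual layout may differ:

    import streamlit as st

    with st.sidebar:
        use_memory = st.checkbox("Enable Memory", value=True)
        memory_depth = st.slider("Memory Depth", min_value=1, max_value=20, value=5)
        relevance = st.slider("Relevance Threshold", min_value=0.1, max_value=0.9,
                              value=0.3, step=0.05)
        max_context = st.slider("Max Context Length", min_value=300, max_value=2000,
                                value=1000, step=100)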

Clearing Memory

Use the "🗑️ Clear Memory" button in the sidebar to reset conversation history for the current model.
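
A plausible implementation is simply to drop and recreate the model's ChromaDB collection (the collection name and path here are assumptions):

    import chromadb

    client = chromadb.PersistentClient(path="memory_llama3.2_3b")
    client.delete_collection("conversations")                      # wipe stored turns
    collection = client.get_or_create_collection("conversations")  # start empty again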

📁 File Structure

├── stateful_ai_chat.py           # Main application
├── requirements.txt              # Python dependencies
├── README.md                     # This file
└── memory_<model_name>/          # Auto-created memory databases
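
Each model gets its own on-disk database directory. One way the per-model path might be derived; the sanitization rule is an assumption, since model tags like llama3.2:3b contain characters that are awkward in folder names:

    import chromadb

    def memory_client(model: str):
        """Open (or create) the ChromaDB database belonging to one model."""
        safe_name = model.replace(":", "_").replace("/", "_")  # e.g. "llama3.2_3b"
        return chromadb.PersistentClient(path=f"memory_{safe_name}")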

🔧 Troubleshooting

"Cannot connect to Ollama"

  • Ensure Ollama is running: ollama serve
  • Check if accessible: curl http://localhost:11434/api/tags
  • Verify models are installed: ollama list
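
The same check can be run from Python before the app issues any chat requests (a sketch mirroring the curl call above):

    import requests

    try:
        requests.get("http://localhost:11434/api/tags", timeout=3).raise_for_status()
        print("Ollama is reachable.")
    except requests.RequestException as err:
        print(f"Cannot connect to Ollama ({err}). Is `ollama serve` running?")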

"No models found"

  • Install a model: ollama pull <model-name>
  • Popular options: gemma3:12b, llama3.2:3b, qwen2.5-coder:7b

Memory issues

  • ChromaDB databases are created automatically in ./memory_<model>/
  • To reset completely, delete these folders
  • Check disk space if memory operations fail

Performance tips

  • Smaller models respond faster but may be less capable
  • Reduce memory depth for faster responses
  • Increase the relevance threshold to retrieve fewer, more focused memories

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b new-feature
  3. Commit changes: git commit -am 'Add new feature'
  4. Push to branch: git push origin new-feature
  5. Submit a pull request

📝 License

This project is open source. Feel free to use, modify, and distribute.

🙏 Acknowledgments

  • Built with Streamlit for the web interface
  • Uses Ollama for local AI model inference
  • Memory powered by ChromaDB vector database
  • Embeddings via Ollama's all-minilm model
