A simple web application for real-time AI vision analysis using SmolVLM-500M-Instruct with live camera feed processing and text-to-speech.

CasualEngineerZombie/smolvlm-realtime-face

🎥 SmolVLM Realtime Vision

A simple web application for real-time AI vision analysis using SmolVLM-500M-Instruct with live camera feed processing and text-to-speech.

🚀 Quick Start

Prerequisites

  • Modern web browser with camera access
  • llama-server (part of llama.cpp) running SmolVLM locally

Setup

  1. Install and run SmolVLM server:

    # Install llama.cpp (it provides llama-server), then start the server.
    # The -hf flag fetches the model from Hugging Face on first run.
    llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF
  2. Serve the application:

    # Using Python
    python -m http.server 8000
    
    # Or open index.html directly in your browser
  3. Access: Open http://localhost:8000 in your browser

💻 Usage

  1. Grant camera permissions when prompted
  2. Configure API endpoint (default: http://localhost:8080)
  3. Set your instruction (e.g., "What do you see?")
  4. Choose the request interval (100 ms to 2 s)
  5. Click Start to begin real-time analysis
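
The steps above amount to sending one camera frame plus the instruction to the server on a timer. Below is a minimal sketch of that request, assuming the page talks to llama-server's OpenAI-compatible `/v1/chat/completions` endpoint and sends the frame as a base64 data URL. The helper names (`clampInterval`, `buildRequest`, `analyzeFrame`), the 500 ms fallback, and the `max_tokens` value are hypothetical and not taken from this repository.

```javascript
// Hypothetical helpers; names and defaults are illustrative assumptions.

// Keep the polling interval inside the 100 ms - 2 s range offered by the UI.
function clampInterval(ms) {
  const n = Number(ms);
  if (!Number.isFinite(n)) return 500; // assumed fallback for invalid input
  return Math.min(2000, Math.max(100, n));
}

// Build the URL and JSON body for one frame.
function buildRequest(baseUrl, instruction, imageDataUrl) {
  // Strip trailing slashes so "http://localhost:8080/" also works.
  const url = baseUrl.replace(/\/+$/, '') + '/v1/chat/completions';
  const body = {
    max_tokens: 100, // assumed cap to keep responses short
    messages: [{
      role: 'user',
      content: [
        { type: 'text', text: instruction },
        { type: 'image_url', image_url: { url: imageDataUrl } },
      ],
    }],
  };
  return { url, body };
}

// Send one frame and return the model's text reply.
async function analyzeFrame(baseUrl, instruction, imageDataUrl) {
  const { url, body } = buildRequest(baseUrl, instruction, imageDataUrl);
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Server returned ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

In the browser, the data URL would typically come from drawing the `<video>` element onto a canvas and calling `canvas.toDataURL('image/jpeg')`, with `analyzeFrame` invoked every `clampInterval(...)` milliseconds while analysis is running.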

🎯 Features

  • Real-time camera feed processing
  • Customizable AI instructions
  • Text-to-speech responses
  • Adjustable analysis intervals
  • Modern, responsive UI

🛡️ Privacy

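  • Local processing only: frames are sent to the API endpoint you configure (a local llama-server by default)
  • No data storage
  • Camera access requires a secure context (HTTPS, or http://localhost during local use)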
  • User-controlled camera permissions
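
Browsers only expose the camera (`getUserMedia`) in a secure context, which covers HTTPS pages and, in current browsers, pages served from `http://localhost`. The sketch below is a rough approximation of that rule for a given page URL; `isLikelySecureOrigin` is a hypothetical helper, not part of this repository, and it does not cover every case the browser handles.

```javascript
// Rough approximation of the browser's secure-context rule for camera access.
// Hypothetical helper; the browser's own check is authoritative.
function isLikelySecureOrigin(pageUrl) {
  const { protocol, hostname } = new URL(pageUrl);
  if (protocol === 'https:') return true;
  // Loopback hosts are treated as trustworthy even over plain HTTP.
  const isLoopback =
    hostname === 'localhost' ||
    hostname.endsWith('.localhost') ||
    hostname === '127.0.0.1';
  return protocol === 'http:' && isLoopback;
}
```

Inside the page itself, `window.isSecureContext` gives the browser's actual answer, so a helper like this is mainly useful for explaining why the app works on `http://localhost:8000` but not when served over plain HTTP from another machine on the network.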

📝 License

MIT License


Made with ❤️ for the AI community
