# 🎥 Real-time Video Analysis

A modern web application that provides real-time object detection and scene analysis for uploaded videos using YOLOv8 and computer vision.

## Demo

## ✨ Features

- 🎬 Auto-Play Analysis - Videos start automatically with live object detection
- 🤖 Real-time Detection - YOLOv8-powered object recognition at 2 fps
- 📝 Scene Descriptions - Natural language descriptions of detected scenes
- 🎯 Bounding Box Overlay - Visual detection results overlaid on the video
- 💬 Live Commentary - YouTube-style comment feed with timestamps
- 🔄 Synchronized Playback - Analysis syncs with the video's play/pause state
- 📱 Responsive Design - Works on desktop and mobile devices
- ⚡ Fast Processing - Optimized for real-time performance

## 🛠️ Tech Stack

### Backend

- FastAPI - Modern async web framework
- YOLOv8 (Ultralytics) - State-of-the-art object detection
- OpenCV - Computer vision and video processing
- PyTorch - Deep learning framework
- NumPy - Numerical computing

### Frontend

- Next.js 14 - React framework with App Router
- TypeScript - Type-safe development
- Tailwind CSS - Utility-first styling
- HTML5 Canvas - Frame capture and overlay rendering

## 🚀 Quick Start

### Prerequisites

- Python 3.8+
- Node.js 16+
- npm or yarn

### Backend Setup

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/realtime-video-analysis.git
   cd realtime-video-analysis
   ```

2. Install the Python dependencies:

   ```bash
   pip install fastapi uvicorn ultralytics opencv-python torch numpy
   ```

3. Run the backend server:

   ```bash
   python main.py
   ```

The API will be available at http://localhost:8000
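The project structure below also lists a `backend/requirements.txt`; a minimal version mirroring the `pip install` command above (unpinned here, so pin versions for reproducible installs) would be:

```text
fastapi
uvicorn
ultralytics
opencv-python
torch
numpy
```

With that file in place, `pip install -r requirements.txt` replaces the one-off command.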

### Frontend Setup

1. Navigate to the frontend directory:

   ```bash
   cd frontend  # or wherever your Next.js app is located
   ```

2. Install dependencies:

   ```bash
   npm install
   # or
   yarn install
   ```

3. Set environment variables:

   ```bash
   # Create .env.local
   NEXT_PUBLIC_API_BASE_URL=http://localhost:8000
   ```

4. Run the development server:

   ```bash
   npm run dev
   # or
   yarn dev
   ```

The app will be available at http://localhost:3000

## 📖 Usage

  1. Upload Video: Drag & drop or select a video file (MP4, MOV, AVI, WebM)
  2. Automatic Analysis: Video auto-plays with real-time object detection
  3. View Results: See live detection messages and bounding boxes
  4. Navigate: Click timestamps to jump to specific moments

### Supported Video Formats

- MP4
- MOV
- AVI
- WebM

## 🔌 API Documentation

### Upload Video

```
POST /upload-video
Content-Type: multipart/form-data

Form Data:
- file: video file
```

### Analyze Frame

```
POST /analyze-frame
Content-Type: application/json

{
  "frame_data": "base64_encoded_image",
  "timestamp": 12.34
}
```

### Stream Video

```
GET /video/{video_id}
```

### Health Check

```
GET /health
```
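For reference, here is a minimal Python client exercising these endpoints. This is a sketch: the file names are placeholders, and the shape of the JSON the backend returns (beyond the request formats documented above) is an assumption.

```python
# client_example.py - sketch of calling the endpoints above with `requests`.
# File names are placeholders; printed response fields depend on what the
# backend actually returns.
import base64

import requests

BASE_URL = "http://localhost:8000"

# Health check
print(requests.get(f"{BASE_URL}/health").json())

# Upload a video (multipart/form-data, field name "file")
with open("sample.mp4", "rb") as f:
    upload = requests.post(f"{BASE_URL}/upload-video", files={"file": f})
print(upload.json())

# Analyze a single frame: base64-encode an image and send it with a timestamp
with open("frame.jpg", "rb") as f:
    frame_b64 = base64.b64encode(f.read()).decode("ascii")

response = requests.post(
    f"{BASE_URL}/analyze-frame",
    json={"frame_data": frame_b64, "timestamp": 12.34},
)
print(response.json())
```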

πŸ“ Project Structure

```text
realtime-video-analysis/
├── backend/
│   ├── main.py              # FastAPI application
│   ├── analyzer.py          # YOLOv8 analysis engine
│   └── requirements.txt     # Python dependencies
├── frontend/
│   ├── app/
│   │   ├── page.tsx         # Main application page
│   │   └── components/
│   │       ├── VideoUploadPlayer.tsx
│   │       └── DetectionMessages.tsx
│   ├── package.json
│   └── tailwind.config.js
└── README.md
```

## 🎯 Key Components

### Backend Components

- VisionAnalyzer - Core YOLOv8 analysis engine (see the sketch after this list)
- FastAPI Routes - REST API endpoints
- Frame Analysis - Real-time video frame processing
- Scene Description - Natural language generation
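To make these pieces concrete, here is a minimal sketch of a VisionAnalyzer-style engine. It is illustrative only, not the actual analyzer.py; the scene-description wording in particular is an assumption.

```python
# Illustrative sketch of a VisionAnalyzer-style engine (not the actual
# analyzer.py). Uses the ultralytics package and the yolov8n.pt weights
# named in the Configuration section below.
from collections import Counter

import numpy as np
from ultralytics import YOLO


class VisionAnalyzer:
    def __init__(self, weights: str = "yolov8n.pt"):
        self.model = YOLO(weights)  # official weights auto-download on first use

    def analyze_frame(self, frame: np.ndarray) -> dict:
        """Run detection on one BGR frame and summarize the scene."""
        result = self.model(frame, verbose=False)[0]
        detections = [
            {
                "label": result.names[int(box.cls)],
                "confidence": float(box.conf),
                "box": [float(v) for v in box.xyxy[0]],  # x1, y1, x2, y2
            }
            for box in result.boxes
        ]
        return {"detections": detections, "description": self._describe(detections)}

    @staticmethod
    def _describe(detections: list) -> str:
        # Naive natural-language summary; the real wording may differ.
        if not detections:
            return "Nothing detected in this frame."
        counts = Counter(d["label"] for d in detections)
        parts = [f"{n} {label}{'s' if n > 1 else ''}" for label, n in counts.items()]
        return "Detected " + ", ".join(parts) + "."
```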

### Frontend Components

- VideoUploadPlayer - Video upload and playback with analysis
- DetectionMessages - Real-time message feed
- Canvas Overlay - Bounding box visualization

βš™οΈ Configuration

### Backend Configuration

- Analysis Rate: 2 fps (one frame every 500 ms)
- Model: YOLOv8 nano (yolov8n.pt), chosen for speed
- Video Storage: Temporary, session-based storage

### Frontend Configuration

- Auto-play: Enabled by default
- CORS: Configured for cross-origin video access (see the sketch below)
- Layout: Fixed-height responsive design
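The CORS bullet above implies a matching backend setting. A minimal sketch using FastAPI's `CORSMiddleware`, assuming the Next.js dev origin from the Quick Start (the real main.py may configure this differently):

```python
# Sketch of CORS setup for cross-origin video access; the allowed origin
# (the Next.js dev server) is an assumption based on the Quick Start.
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],  # Next.js dev server
    allow_methods=["*"],
    allow_headers=["*"],
)
```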

πŸ› Debugging

The application includes comprehensive logging:

### Backend Logs

πŸ” === FRAME ANALYSIS START ===
πŸ“ Frame dimensions: (720, 1280, 3)
🎯 Total detections: 3
πŸ“ === GENERATING SCENE DESCRIPTION ===

### Frontend Logs (Browser Console)

```text
🎯 analyzeFrame called
📹 Video ref: true Canvas ref: true
🌐 Sending analysis request
✅ Analysis successful
```

## 🔧 Development

### Running in Development Mode

```bash
# Backend with auto-reload
uvicorn main:app --reload --host 0.0.0.0 --port 8000

# Frontend with hot reload
npm run dev
```

### Adding New Detection Classes

Modify the `class_names` processing in `analyzer.py` to handle additional YOLO classes or custom models, as in the sketch below.
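A hypothetical illustration of that kind of change; the `ALLOWED` set and `filter_detections` helper are made-up names for this sketch, not code from analyzer.py:

```python
# Hypothetical sketch of loading a custom model and restricting which
# classes are reported; the actual class_names handling in analyzer.py
# may look different.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # or a custom-trained model, e.g. "my_model.pt"

ALLOWED = {"person", "car", "dog"}  # example subset of interest

def filter_detections(result):
    """Keep only detections whose class name is in ALLOWED."""
    return [
        (result.names[int(box.cls)], float(box.conf))
        for box in result.boxes
        if result.names[int(box.cls)] in ALLOWED
    ]
```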

## 📊 Performance

- Analysis Speed: ~2 fps real-time processing
- Model Size: YOLOv8 nano (~6 MB)
- Memory Usage: Kept low via temporary, session-based storage
- Supported Resolution: Up to 4K video input

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

πŸ™ Acknowledgments
