A modern web application that provides real-time object detection and scene analysis for uploaded videos using YOLOv8 and computer vision.
- Auto-Play Analysis - Videos start automatically with live object detection
- Real-time Detection - YOLOv8-powered object recognition at 2 fps
- Scene Descriptions - Natural language descriptions of detected scenes
- Bounding Box Overlay - Visual detection results overlaid on video
- Live Commentary - YouTube-style comment feed with timestamps
- Synchronized Playback - Analysis syncs with video play/pause state
- Responsive Design - Works on desktop and mobile devices
- Fast Processing - Optimized for real-time performance
- FastAPI - Modern async web framework
- YOLOv8 (Ultralytics) - State-of-the-art object detection
- OpenCV - Computer vision and video processing
- PyTorch - Deep learning framework
- NumPy - Numerical computing
- Next.js 14 - React framework with App Router
- TypeScript - Type-safe development
- Tailwind CSS - Utility-first styling
- HTML5 Canvas - Frame capture and overlay rendering
- Python 3.8+
- Node.js 18.17+ (the minimum version supported by Next.js 14)
- npm or yarn
- Clone the repository
git clone https://github.com/yourusername/realtime-video-analysis.git
cd realtime-video-analysis
- Install Python dependencies
pip install fastapi uvicorn ultralytics opencv-python torch numpy
- Run the backend server
python main.py
The API will be available at http://localhost:8000
- Navigate to frontend directory
cd frontend # or wherever your Next.js app is located
- Install dependencies
npm install
# or
yarn install
- Set environment variables
# Create .env.local
NEXT_PUBLIC_API_BASE_URL=http://localhost:8000
- Run the development server
npm run dev
# or
yarn dev
The app will be available at http://localhost:3000
- Upload Video: Drag & drop or select a video file (MP4, MOV, AVI, WebM)
- Automatic Analysis: Video auto-plays with real-time object detection
- View Results: See live detection messages and bounding boxes
- Navigate: Click timestamps to jump to specific moments
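The clickable timestamps in the feed need a display form for a playback position in seconds. A minimal sketch of one way to format them follows; `format_timestamp` is a hypothetical helper, not a function from this repository, and the `M:SS` layout is an assumption.

```python
def format_timestamp(seconds: float) -> str:
    """Format a playback position in seconds as M:SS, or H:MM:SS past one hour."""
    total = max(0, int(seconds))  # clamp negatives and drop fractional seconds
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


if __name__ == "__main__":
    print(format_timestamp(12.34))  # position taken from the /analyze-frame example
```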
- MP4
- MOV
- AVI
- WebM
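A client or server can reject unsupported files before uploading by checking the extension against the list above. This is a sketch only: `SUPPORTED_EXTENSIONS` and `is_supported_video` are hypothetical names, and an extension check says nothing about whether the file's codec can actually be decoded.

```python
from pathlib import Path

# Extensions taken from the supported formats listed above.
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".avi", ".webm"}


def is_supported_video(filename: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```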
POST /upload-video
Content-Type: multipart/form-data
Form Data:
- file: video file
POST /analyze-frame
Content-Type: application/json
{
"frame_data": "base64_encoded_image",
"timestamp": 12.34
}
GET /video/{video_id}
GET /health
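The `/analyze-frame` body can be assembled with the Python standard library alone. The sketch below builds the request without sending it. It assumes `frame_data` is plain base64 (the README does not say whether a `data:image/...;base64,` prefix is expected), and `build_analyze_frame_request` is a hypothetical helper name.

```python
import base64
import json
import urllib.request

API_BASE_URL = "http://localhost:8000"  # backend address from the setup steps


def build_analyze_frame_request(
    image_bytes: bytes,
    timestamp: float,
    base_url: str = API_BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a POST /analyze-frame request."""
    payload = {
        # Assumption: plain base64 with no data-URL prefix.
        "frame_data": base64.b64encode(image_bytes).decode("ascii"),
        "timestamp": timestamp,
    }
    return urllib.request.Request(
        url=f"{base_url}/analyze-frame",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the backend running, the request can be sent with `urllib.request.urlopen(request)`. The response schema is not documented here, so inspect the returned JSON rather than relying on specific field names.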
realtime-video-analysis/
├── backend/
│   ├── main.py              # FastAPI application
│   ├── analyzer.py          # YOLOv8 analysis engine
│   └── requirements.txt     # Python dependencies
├── frontend/
│   ├── app/
│   │   ├── page.tsx         # Main application page
│   │   └── components/
│   │       ├── VideoUploadPlayer.tsx
│   │       └── DetectionMessages.tsx
│   ├── package.json
│   └── tailwind.config.js
└── README.md
- VisionAnalyzer - Core YOLOv8 analysis engine
- FastAPI Routes - REST API endpoints
- Frame Analysis - Real-time video frame processing
- Scene Description - Natural language generation
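One simple way to turn detections into a scene description is to count labels and fill a template. The sketch below illustrates that idea only. It is not the logic in `analyzer.py`, and `describe_scene` is a hypothetical name.

```python
from collections import Counter
from typing import Iterable


def describe_scene(labels: Iterable[str]) -> str:
    """Summarize detected class labels as one sentence."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return "No objects detected."
    # most_common() orders by count; ties keep first-seen order.
    parts = [f"{n} x {label}" for label, n in counts.most_common()]
    noun = "object" if total == 1 else "objects"
    return f"Detected {total} {noun}: {', '.join(parts)}."
```

Template output like this is deterministic and cheap enough to run on every analyzed frame, though it reads less naturally than generated prose.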
- VideoUploadPlayer - Video upload and playback with analysis
- DetectionMessages - Real-time message feed
- Canvas Overlay - Bounding box visualization
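Drawing boxes on an overlay requires mapping coordinates from the analyzed frame to the displayed canvas size. The frontend itself is TypeScript, so the Python sketch below shows only the arithmetic. It assumes boxes are `(x1, y1, x2, y2)` pixel coordinates and that the canvas has the same aspect ratio as the frame (no letterboxing). `scale_box` is a hypothetical name.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def scale_box(box: Box, frame_size: Tuple[int, int], canvas_size: Tuple[int, int]) -> Box:
    """Map a box from frame pixels to canvas pixels.

    frame_size and canvas_size are (width, height) pairs.
    """
    x1, y1, x2, y2 = box
    frame_w, frame_h = frame_size
    canvas_w, canvas_h = canvas_size
    sx = canvas_w / frame_w  # horizontal scale factor
    sy = canvas_h / frame_h  # vertical scale factor
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)
```

For a 1280x720 frame shown on a 640x360 canvas, both scale factors are 0.5, so every coordinate is halved.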
- Analysis Rate: 2fps (500ms intervals)
- Model: YOLOv8 nano (yolov8n.pt) for speed
- Video Storage: Temporary session-based storage
- Auto-play: Enabled by default
- CORS: Configured for cross-origin video access
- Layout: Fixed-height responsive design
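The analysis rate and the interval describe the same setting: 1000 ms / 500 ms = 2 frames per second. The sketch below lists the playback positions that would be sampled at that interval. `ANALYSIS_INTERVAL_MS` and `analysis_timestamps` are hypothetical names, and the real frontend samples on a timer during playback rather than precomputing positions.

```python
from typing import List

ANALYSIS_INTERVAL_MS = 500  # 2 fps, as listed in the configuration above


def analysis_timestamps(duration_s: float, interval_ms: int = ANALYSIS_INTERVAL_MS) -> List[float]:
    """Return the positions (in seconds) sampled over a clip of the given length."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    # Number of whole intervals that fit, plus the sample at t = 0.
    count = int(duration_s * 1000 // interval_ms) + 1
    return [i * interval_ms / 1000 for i in range(count)]
```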
The application includes comprehensive logging:
=== FRAME ANALYSIS START ===
Frame dimensions: (720, 1280, 3)
Total detections: 3
=== GENERATING SCENE DESCRIPTION ===
analyzeFrame called
Video ref: true Canvas ref: true
Sending analysis request
Analysis successful
# Backend with auto-reload
uvicorn main:app --reload --host 0.0.0.0 --port 8000
# Frontend with hot reload
npm run dev
Modify the `class_names` processing in `analyzer.py` to handle additional YOLO classes or custom models.
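As a rough illustration of class handling, the sketch below keeps only detections whose class is on an allow-list. The detection dictionaries and `filter_detections` are hypothetical and do not reflect the actual structures in `analyzer.py`. The ID-to-name mapping mirrors the one Ultralytics models expose as `model.names`, shown here with three COCO entries.

```python
from typing import Dict, Iterable, List

# Subset of the COCO id -> name mapping, for illustration only.
CLASS_NAMES: Dict[int, str] = {0: "person", 2: "car", 16: "dog"}


def filter_detections(
    detections: Iterable[dict],
    class_names: Dict[int, str],
    allowed: Iterable[str],
) -> List[dict]:
    """Keep detections whose class name is allowed, adding a 'label' key."""
    allowed_set = set(allowed)
    kept = []
    for det in detections:
        name = class_names.get(det["class_id"])  # None for unknown ids
        if name in allowed_set:
            kept.append({**det, "label": name})
    return kept
```

A custom model would supply its own mapping in place of `CLASS_NAMES`, and detections with unknown IDs are dropped rather than raising an error.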
- Analysis Speed: ~2fps real-time processing
- Model Size: YOLOv8 nano (~6MB)
- Memory Usage: Optimized for session-based processing
- Supported Resolution: Up to 4K video input
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Ultralytics YOLOv8 - Object detection model
- FastAPI - Web framework
- Next.js - React framework
- OpenCV - Computer vision library