buttaRahul/multimodal_video_summarization


Multimodal Video Summarization

This project enhances video summarization by combining textual and visual information. Instead of relying solely on the video transcript, it extracts key frames based on the importance of sentences in the transcript and generates descriptions for those key frames. These descriptions, together with the transcript, are then passed to a Large Language Model (LLM) to generate a more comprehensive, context-aware summary.
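
The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual code: the transcript format (start time plus sentence), the word-frequency scoring, and the names `score_sentences`, `select_key_timestamps`, and `build_prompt` are all hypothetical. Frame extraction, frame captioning, and the LLM call are left out because they depend on the video library and models the backend uses.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# Assumed transcript format: (start_time_in_seconds, sentence) pairs.
Transcript = List[Tuple[float, str]]


def score_sentences(transcript: Transcript) -> List[float]:
    """Score each sentence by the average transcript-wide frequency of its words.

    This is a simple stand-in for whatever importance measure the project uses.
    """
    tokenized = [re.findall(r"[a-z']+", text.lower()) for _, text in transcript]
    freq = Counter(word for words in tokenized for word in words)
    return [
        sum(freq[w] for w in words) / len(words) if words else 0.0
        for words in tokenized
    ]


def select_key_timestamps(transcript: Transcript, k: int) -> List[float]:
    """Return the start times of the k highest-scoring sentences, in time order."""
    scores = score_sentences(transcript)
    ranked = sorted(range(len(transcript)), key=lambda i: scores[i], reverse=True)
    return sorted(transcript[i][0] for i in ranked[:k])


def build_prompt(transcript: Transcript, frame_descriptions: Dict[float, str]) -> str:
    """Combine the transcript and key-frame descriptions into one LLM prompt."""
    transcript_text = " ".join(text for _, text in transcript)
    frames_text = "\n".join(
        f"[{t:.1f}s] {desc}" for t, desc in sorted(frame_descriptions.items())
    )
    return (
        "Summarize the video using the transcript and key-frame descriptions.\n\n"
        f"Transcript:\n{transcript_text}\n\n"
        f"Key frames:\n{frames_text}\n"
    )


if __name__ == "__main__":
    demo = [(0.0, "The cat sat."), (5.0, "The cat ran."), (10.0, "Dogs bark.")]
    times = select_key_timestamps(demo, k=2)
    # In the real pipeline, a vision model would describe the frame at each time.
    descriptions = {t: "placeholder frame description" for t in times}
    print(build_prompt(demo, descriptions))
```

The timestamps returned by `select_key_timestamps` mark where frames would be grabbed from the video. Each frame's generated description then goes into `build_prompt` alongside the full transcript.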


Getting Started

Follow the steps below to clone the repository, install dependencies, and start both the backend and frontend servers.

Prerequisites

  • Python 3.8 or higher
  • Node.js 16 or higher
  • npm (comes with Node.js)
  • Git
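
To confirm the prerequisites before continuing, a small script along these lines can report what is installed. It is only a sketch: the helper names `parse_major` and `tool_version` are made up for illustration, and it assumes each tool prints its version in response to `--version`, which `node`, `npm`, and `git` do.

```python
import re
import shutil
import subprocess
import sys
from typing import Optional


def parse_major(version_output: str) -> Optional[int]:
    """Extract the major version from output such as 'v18.17.0' or 'git version 2.43.0'."""
    match = re.search(r"(\d+)\.\d+", version_output)
    return int(match.group(1)) if match else None


def tool_version(command: str) -> Optional[str]:
    """Return the tool's `--version` output, or None if it is not on PATH."""
    path = shutil.which(command)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return result.stdout.strip() or result.stderr.strip()


if __name__ == "__main__":
    python_ok = sys.version_info >= (3, 8)
    print(f"python: {sys.version.split()[0]} ({'OK' if python_ok else 'needs 3.8+'})")

    # Minimum major versions from the list above; None means any version is fine.
    for tool, minimum in (("node", 16), ("npm", None), ("git", None)):
        version = tool_version(tool)
        if version is None:
            print(f"{tool}: not found")
            continue
        major = parse_major(version)
        ok = minimum is None or (major is not None and major >= minimum)
        print(f"{tool}: {version} ({'OK' if ok else f'needs {minimum}+'})")
```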

Clone the Repository

# Clone the repository to your local machine
git clone https://github.com/buttaRahul/multimodal_video_summarization.git

# Navigate to the project directory
cd multimodal_video_summarization

Backend Setup

Install Dependencies

# Navigate to the backend directory
cd backend

# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Install Python dependencies
pip install -r requirements.txt

Start the Backend Server

Once dependencies are installed, start the backend server:

# Run the FastAPI backend server
uvicorn main:app --reload

# The server will run at http://127.0.0.1:8000
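
To check that the backend started correctly, the sketch below requests FastAPI's interactive documentation page, which is served at `/docs` by default. It assumes the project has not disabled or moved that page; the function name `backend_is_up` is hypothetical.

```python
import urllib.error
import urllib.request


def backend_is_up(base_url: str = "http://127.0.0.1:8000", timeout: float = 3.0) -> bool:
    """Return True if the docs page at base_url responds with HTTP 200."""
    # Skip any configured HTTP proxy, since the target is a local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"{base_url}/docs", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False


if __name__ == "__main__":
    if backend_is_up():
        print("Backend is reachable.")
    else:
        print("Backend is not reachable. Is uvicorn running?")
```

Opening http://127.0.0.1:8000/docs in a browser gives the same confirmation and, if the page is enabled, lists the available endpoints.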

Frontend Setup

Install Dependencies

Navigate to the frontend directory and install the required JavaScript dependencies:

# Navigate to the frontend directory
cd ../frontend

# Install npm packages
npm install

Start the Frontend Server

# Start the React frontend server
npm run dev

# The frontend server will run at http://localhost:3000
# (if the terminal prints a different URL, use that one instead)

Accessing the Application

Once both the backend and frontend servers are running, open the frontend URL (http://localhost:3000) in your browser to use the application.


Project Structure

  • backend/: Contains the FastAPI backend code.
  • frontend/: Contains the React frontend code.
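  • requirements.txt: Lists Python dependencies for the backend (installed from the backend/ directory in the setup steps).
  • package.json: Lists JavaScript dependencies for the frontend (installed from the frontend/ directory in the setup steps).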

Project Demonstration

Click the image to watch the video demonstration: Demo Video
