
A complete project analyzing 250+ top-rated Indian movies (1955–2024) using real-world IMDb data. This project combines Python (Pandas, NumPy) and Power BI to uncover audience preferences, industry contributions, top genres, and trend shifts across decades.


niravtrivedi23/IMDb-Top-250-Indian-Movie-Analysis


🎬 IMDb Top 250 Indian Movie Analysis (1955–2024)

A complete data-driven project that explores the evolution of Indian cinema by analyzing 250+ top-rated movies from IMDb using Python and Power BI.


🧠 Project Objective

To uncover insightful patterns and audience preferences in India’s most iconic films by analyzing:

  • ⭐ IMDb Ratings & Audience Votes
  • 📅 Year-wise Trends
  • 🗣️ Language & Genre Preferences
  • 🏭 Industry Contributions
  • 🎯 Movie Clusters Based on Popularity & Performance
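
This README does not say how the popularity and performance clusters were built. As one simple possibility, the sketch below splits films at the median rating and median vote count. The column names `Rating` and `Votes`, the sample rows, and the segment labels are all assumptions made for illustration, not the project's actual method.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's column names may differ.
movies = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C", "Film D"],
    "Rating": [9.1, 8.2, 8.8, 8.1],
    "Votes": [5_000, 200_000, 150_000, 3_000],
})

# Flag each film as above/below the median on both axes.
high_rating = movies["Rating"] >= movies["Rating"].median()
high_votes = movies["Votes"] >= movies["Votes"].median()

# Illustrative labels for the four (rating, votes) combinations.
SEGMENT_NAMES = {
    (True, True): "acclaimed and popular",
    (True, False): "acclaimed, fewer votes",
    (False, True): "popular, lower rated",
    (False, False): "neither",
}

movies["Segment"] = [
    SEGMENT_NAMES[(bool(r), bool(v))] for r, v in zip(high_rating, high_votes)
]
print(movies[["Title", "Segment"]])
```

A median split is deliberately crude; a distance-based method such as k-means on scaled rating and vote counts would be a natural next step if finer groups are needed.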

📅 Dataset Overview

| Feature | Detail |
| --- | --- |
| 📁 Dataset | IMDb Top 250 Indian Movies |
| 📆 Duration | 1955 – 2024 |
| 📄 File Size | ~250 Rows × 8 Columns |
| 📍 Source | Curated IMDb List |
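
A quick way to check a loaded file against the overview above is to look at its shape and year range. The sketch below uses a tiny in-memory CSV as a stand-in for `IMDB_Movies_India.csv`; the column names `Title`, `Year`, and `Rating` are assumptions, since the README does not list the eight columns.

```python
import io

import pandas as pd

# Stand-in for IMDB_Movies_India.csv; column names here are assumed.
sample_csv = io.StringIO(
    "Title,Year,Rating\n"
    "Film A,1955,8.4\n"
    "Film B,2024,8.1\n"
)

movies = pd.read_csv(sample_csv)

# Row/column counts and the span of release years covered.
n_rows, n_cols = movies.shape
year_span = (int(movies["Year"].min()), int(movies["Year"].max()))

print(f"{n_rows} rows x {n_cols} columns, years {year_span[0]}-{year_span[1]}")
```

With the real file, replacing `sample_csv` with the CSV path should report roughly 250 rows and 8 columns if the overview holds.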

🐍 Python Analysis – Step-by-Step

🔗 movie_analysis.py (Python script)
🔗 📄 PDF Summary of Python Analysis

✅ Step 1: Data Cleaning

  • Removed commas & symbols from vote/rating fields
  • Fixed null values, typos, and date formats
  • Converted datatypes for numeric operations
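
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the code from `movie_analysis.py`: the column names (`Votes`, `Rating`, `Year`), the sample values, and the helper `clean_movies` are hypothetical.

```python
import pandas as pd

# Hypothetical raw rows showing the kinds of problems described above.
raw = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C"],
    "Votes": ["1,234", "56,789", None],
    "Rating": ["8.4", "8.1/10", "9.0"],
    "Year": ["(1957)", "2018", "2021"],
})


def clean_movies(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with votes, rating, and year converted to numeric types."""
    out = df.copy()
    # Drop thousands separators, then convert; missing votes become <NA>.
    votes = out["Votes"].str.replace(",", "", regex=False)
    out["Votes"] = pd.to_numeric(votes, errors="coerce").astype("Int64")
    # Keep only the leading number of the rating, e.g. "8.1/10" -> 8.1.
    rating = out["Rating"].str.extract(r"(\d+(?:\.\d+)?)", expand=False)
    out["Rating"] = pd.to_numeric(rating, errors="coerce")
    # Pull the four-digit year out of strings such as "(1957)".
    year = out["Year"].str.extract(r"(\d{4})", expand=False)
    out["Year"] = pd.to_numeric(year, errors="coerce").astype("Int64")
    return out


cleaned = clean_movies(raw)
print(cleaned.dtypes)
```

Using `errors="coerce"` turns unparseable values into missing values instead of raising, which makes the remaining nulls easy to inspect before deciding how to fill or drop them.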

📊 Step 2: Descriptive Statistics

| Metric | Value |
| --- | --- |
| 🎭 Most Common Genre | Drama |
| 🗣️ Top Language | Hindi (102) |
| 🌟 Avg. IMDb Rating | 8.19 |
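
Metrics like these can be derived with a few pandas calls. The sketch below assumes columns named `Genre`, `Language`, and `Rating`, and assumes that a film's genres are stored as a comma-separated string; neither is confirmed by this README, and the sample rows are made up.

```python
import pandas as pd

# Hypothetical sample rows; real column names and values may differ.
movies = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C", "Film D"],
    "Genre": ["Drama, Crime", "Drama", "Comedy, Drama", "Action"],
    "Language": ["Hindi", "Tamil", "Hindi", "Telugu"],
    "Rating": [8.4, 8.1, 8.6, 8.3],
})

# A film can list several genres, so split and explode before counting.
genre_counts = (
    movies["Genre"].str.split(",").explode().str.strip().value_counts()
)
language_counts = movies["Language"].value_counts()

summary = {
    "most_common_genre": genre_counts.idxmax(),
    "top_language": language_counts.idxmax(),
    "top_language_count": int(language_counts.max()),
    "avg_rating": round(float(movies["Rating"].mean()), 2),
}
print(summary)
```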

📈 Step 3: Trend Analysis

  • 📅 Peak Years: 2018 & 2021 (highest releases)
  • 🏆 Top Rated: Mayabazar, Sandesham – IMDb 9.1
  • 🗳️ Most Voted: Jai Bhim – 3.3K+ reviews
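
The three findings above correspond to a group-by count and two maximum lookups. The following is a sketch under assumed column names (`Year`, `Rating`, `Votes`) with invented sample rows; it keeps ties, since the findings report two peak years and two top-rated films.

```python
import pandas as pd

# Hypothetical sample rows; real column names and values may differ.
movies = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C", "Film D", "Film E"],
    "Year": [1957, 2018, 2018, 2021, 2021],
    "Rating": [9.1, 8.2, 9.1, 8.7, 8.3],
    "Votes": [5_000, 120_000, 80_000, 210_000, 40_000],
})

# Releases per year, then every year that ties for the maximum count.
releases_per_year = movies.groupby("Year").size()
peak_years = releases_per_year[
    releases_per_year == releases_per_year.max()
].index.tolist()

# All titles sharing the highest rating (ties are kept).
top_rated = movies.loc[
    movies["Rating"] == movies["Rating"].max(), "Title"
].tolist()

# Single title with the largest vote count.
most_voted = movies.loc[movies["Votes"].idxmax(), "Title"]

print(peak_years, top_rated, most_voted)
```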

🏭 Step 4: Industry Insight

| Industry | Award Share |
| --- | --- |
| Bollywood | 59.4% |
| Kollywood | ~20% |
| Tollywood | ~15% |
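
The README does not define how "award share" is measured. As an illustration only, the sketch below computes each industry's share of rows in the dataset, assuming a hypothetical `Industry` column; if the project weights by awards or another field, the counting step would differ.

```python
import pandas as pd

# Hypothetical sample rows; the "Industry" column name is an assumption.
movies = pd.DataFrame({
    "Industry": ["Bollywood", "Bollywood", "Bollywood", "Kollywood", "Tollywood"],
})

# Percentage of rows per industry, rounded to one decimal place.
share = (movies["Industry"].value_counts(normalize=True) * 100).round(1)

print(share)
```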

📊 Power BI Dashboard – Visual Insights

🔗 📊 Power BI Dashboard File (.pbix)

💡 Key Features

  • 🎬 Top-Rated & Most Voted Movies
  • 📅 Release Trends Over Time
  • 🗣️ Language and Industry Filters
  • 📈 Genre-Based Heatmaps
  • 🧠 Insight Cards for Quick Understanding

📌 Top Findings

| Metric | Highlight |
| --- | --- |
| 🥇 Top Industry | Bollywood – 59.4% award share |
| 📅 Peak Years | 2018 & 2021 – most releases |
| 🌟 Top Rated | Mayabazar, Sandesham (9.1 IMDb) |
| 🗳️ Most Voted | Jai Bhim (3.3K+ reviews) |
| 🗣️ Top Language | Hindi – 102 Movies |
| 🎭 Popular Genre | Drama – most recurring genre |

🖼️ Dashboard Preview

[Image: Indian Movie Dashboard]


🧾 Files Included

| File Name | Description |
| --- | --- |
| IMDB_Movies_India.csv | Cleaned data (1955–2024) |
| movie_analysis.py | Python code for data cleaning + EDA |
| Movie_Analysis_In_Python.pdf | PDF overview of Python analysis |
| Indian_Movie_Dashboard.pbix | Power BI dashboard file |
| Indian_Movie_Dashboard.png | Static preview image of the dashboard |

🔧 Tools Used

| Tool | Purpose |
| --- | --- |
| 🐍 Python | Data cleaning, trend analysis, clustering |
| 📄 Excel | Validation, preprocessing |
| 📊 Power BI | Data storytelling, dashboard creation |

🧠 Final Conclusion

This project blends data with cinema to reveal:

  • How Indian cinema has changed over time
  • What makes a movie resonate with audiences
  • Which industries, genres, and years delivered standout content

📊 Whether you're a data lover or a movie fan, this dashboard makes it easy to explore India’s cinematic journey — one insight at a time.


🤝 Connect With Me

| Platform | Link |
| --- | --- |
| 📧 Email | niravtrivedi069@gmail.com |
| 🔗 LinkedIn | linkedin.com/in/trivedi-nirav-a1760424b |
| 💻 GitHub | github.com/niravtrivedi23 |

🎥 "Every movie tells a story — and so does the data behind it."
Let’s connect and explore the insights hidden in every frame!
