Team-driven data science project analyzing how directors, genres, budgets, and timing influence movie success using IMDb data, machine learning, and visual storytelling.

usjav/movie-success-analytics-cs334

🎬 Unveiling the Director’s Vision: A Deep Dive into Movie Data Analytics

📌 Project Overview

This collaborative Data Science project explores how directors, budgets, genres, and release timing influence movie success. Using scraped IMDb data and machine learning techniques, our team of five analyzed performance trends to guide decision-making in the film industry.

📊 Dataset Description

The dataset includes:

  • 🎥 Director names
  • 💰 Movie budgets
  • 🎭 Genres
  • 📅 Release dates
  • 💵 Box office earnings
  • ⭐ IMDb ratings
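
As a rough illustration of how fields like these might be loaded and typed with Pandas, here is a minimal sketch. The column names (`director`, `budget`, `genre`, `release_date`, `box_office`, `imdb_rating`), the currency formatting, and the sample rows are all assumptions made for this example; the actual layout of `raw_data.csv` may differ.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real raw_data.csv may use different column names.
SAMPLE_CSV = """director,budget,genre,release_date,box_office,imdb_rating
Director A,"$1,000,000",Drama,2019-07-12,"$3,500,000",7.4
Director B,"$20,000,000",Action,2020-12-18,"$15,000,000",6.1
Director A,"$5,000,000",Drama,2021-03-05,,8.0
"""


def load_movies(source):
    """Read a movie CSV and coerce money, date, and rating columns to usable types."""
    df = pd.read_csv(source)
    for col in ("budget", "box_office"):
        # Strip currency symbols and thousands separators, then convert to numbers.
        cleaned = df[col].astype(str).str.replace(r"[^0-9.]", "", regex=True)
        df[col] = pd.to_numeric(cleaned, errors="coerce")
    # Unparseable dates and ratings become NaT / NaN instead of raising.
    df["release_date"] = pd.to_datetime(df["release_date"], errors="coerce")
    df["imdb_rating"] = pd.to_numeric(df["imdb_rating"], errors="coerce")
    return df


movies = load_movies(io.StringIO(SAMPLE_CSV))
print(movies.dtypes)
```

To try this on the repository's data, `load_movies("raw_data.csv")` would be the natural call, provided the column names are adjusted to match the file.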

🔍 Objectives

  • Pinpoint which directors consistently drive commercial and critical success
  • Model the impact of budget and genre on box office performance
  • Analyze seasonal release trends
  • Provide actionable insights for studios and producers
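
The director and seasonality objectives above can be approached with simple group-by summaries. The sketch below is one possible way to do it, not necessarily the approach used in `main.ipynb`. The rows are invented for illustration, the column names are assumptions, and `roi` (box office divided by budget) is a metric chosen for this example.

```python
import pandas as pd

# Hypothetical, hand-made rows for illustration only; not results from the project.
movies = pd.DataFrame({
    "director": ["A", "A", "B", "B", "C"],
    "budget": [10.0, 20.0, 50.0, 40.0, 5.0],       # in millions
    "box_office": [30.0, 50.0, 45.0, 60.0, 4.0],   # in millions
    "imdb_rating": [7.5, 8.0, 6.0, 6.5, 5.5],
    "release_date": pd.to_datetime(
        ["2019-06-14", "2021-07-02", "2018-12-21", "2020-01-10", "2022-03-04"]
    ),
})

# Return on investment: revenue earned per unit of budget.
movies["roi"] = movies["box_office"] / movies["budget"]

# Per-director summary: film count, average ROI, and average rating.
by_director = (
    movies.groupby("director")
    .agg(
        films=("roi", "size"),
        mean_roi=("roi", "mean"),
        mean_rating=("imdb_rating", "mean"),
    )
    .sort_values("mean_roi", ascending=False)
)

# Seasonal view: median ROI by release month (1 = January, 12 = December).
by_month = movies.groupby(movies["release_date"].dt.month)["roi"].median()

print(by_director)
print(by_month)
```

With real data, filtering to directors with several films before ranking would help keep one-off hits from dominating the summary.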

🛠️ Repository Contents

├── extract_data.ipynb      # IMDb scraping logic and raw data generation
├── raw_data.csv            # Unprocessed dataset
├── main.ipynb              # Data cleaning, EDA, visualizations, ML models
├── README.md               # Project overview and documentation

⚙️ Tools & Techniques

  • Libraries: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn
  • Methods:
    • Regression modeling
    • Correlation and trend analysis
    • Feature engineering & encoding
    • Train/test split and performance evaluation
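
To show how the listed methods fit together, here is a minimal Scikit-learn sketch that one-hot encodes genre, splits the data, fits a linear regression, and scores it on the held-out set. It runs on synthetic data generated inside the block; the feature names, the genre effects, and the choice of `LinearRegression` are assumptions for illustration and are not taken from the project's notebooks.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic data (assumption): box office grows with budget plus a per-genre offset.
rng = np.random.default_rng(0)
n = 200
genres = rng.choice(["Action", "Comedy", "Drama"], size=n)
budget = rng.uniform(1, 100, size=n)
genre_effect = (
    pd.Series(genres).map({"Action": 20.0, "Comedy": 5.0, "Drama": 0.0}).to_numpy()
)
box_office = 1.5 * budget + genre_effect + rng.normal(0, 5, size=n)

X = pd.DataFrame({"budget": budget, "genre": genres})
y = box_office

# Encode the categorical genre column; pass the numeric budget through unchanged.
preprocess = ColumnTransformer(
    [("genre", OneHotEncoder(handle_unknown="ignore"), ["genre"])],
    remainder="passthrough",
)
model = Pipeline([("prep", preprocess), ("reg", LinearRegression())])

# Hold out 20% of the rows so the score reflects unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model.fit(X_train, y_train)
test_r2 = r2_score(y_test, model.predict(X_test))
print(f"Held-out R^2: {test_r2:.3f}")
```

Keeping the encoder inside the pipeline means it is fitted only on the training rows, which avoids leaking information from the test set into preprocessing.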

📈 Key Insights

  • Some directors demonstrate consistent profitability and audience acclaim.
  • Genre choice and timing of release significantly influence box office outcomes.
  • Budget optimization strategies can be inferred from predictive modeling results.

📚 Blog Post

For a deeper narrative walkthrough and key takeaways: 👉 Read the blog post

👥 Team Contributions

This project was made possible by a team of five (myself, Bilal, Shaheer, Hasnain, and Ahad), who shared the work of scraping, cleaning, modeling, and presenting results. Collaboration was key in shaping this analytical story.

💡 Future Improvements

  • Model tuning and validation with more complex ML architectures
  • External datasets (Rotten Tomatoes, Metacritic) for enriched analysis
  • Dashboard-style visualization using Plotly or Streamlit

📬 Connect

Feel free to reach out on GitHub to discuss the project, data science, or potential collaborations!
