- Start here: Scroll to the Dashboard Preview to see the Power BI visuals.
- SQL logic: All queries are in the sql-queries/ folder.
- Access the Dashboard: Download the PBIX file (click the link, then “Raw” to download).
- Quick skim: - README.md – walks through objectives, queries, and insights step by step.
This project analyzes a curated sample of the top 10 ROI-performing films drawn from a cleaned dataset of 600+ titles (1997–2016). The aim is to simulate how a customer-insights / business analyst would respond to an executive ask: “What traits consistently show up in our biggest wins?”
Objective: identify repeatable signals—genre, director, budget, runtime, production company—that correlate with outsized ROI and can inform smarter green-light decisions.
Tools & approach
- SQL: data cleaning, joins, views to shape analysis tables
- Excel: validation checks, calculated fields, quick sanity tests
- Power BI: data model, DAX measures, interactive visuals & slicers
Dashboard pages
- Overview: ROI performance at a glance (films, directors, actors, studios)
- Detailed ROI: top performers with drill-downs
- Trends Over Time: ROI & revenue patterns by release year and runtime
- Genre Analysis: category-level ROI comparisons
- Findings: concise recommendations supported by the data
- Behind the Dashboard: build notes and assumptions
Outcome (high level): low-budget categories—especially horror/documentary—frequently outperform on ROI; select studios and directors recur among top performers, suggesting targeted bets can amplify returns.
- Dashboard Preview
- Business Objective
- Key Insights From the Analysis
- Recommendations
- Data Source
- Data Preparation and Cleaning
- SQL Queries
- Exploratory Data Analysis (EDA)
- Real World Role Alignment
- Folder Structure
- Project Progress and Next Steps
- Reflection
- Contact
- Disclaimer
This dashboard analyzes ROI trends across 600+ films released between 1997–2016. Results highlight how low-budget horror and documentary films consistently outperform big-budget productions, with Blumhouse Productions, Paranormal Activity, and director Oren Peli dominating ROI performance. The findings reinforce how lean production strategies often deliver the strongest financial returns.
Metric | Value |
---|---|
ROI Formula | (Revenue – Budget) / Budget |
# of Titles Analyzed | ~600 films (cleaned dataset) |
Time Window | 1997–2016 releases |
Top ROI Genre | Horror (~3.1 ROI) |
Lowest ROI Genre | Crime (~1.4 ROI) |
Top Production Company | Blumhouse Productions (3.2K+ ROI) |
Top Director | Oren Peli (12.9K ROI – Paranormal Activity) |
Top Actor (Avg ROI) | Michael C. (4.1K ROI – Paranormal Activity) |
Most Profitable Film | Paranormal Activity (12.9K ROI) |
Avg ROI (All Films) | ~7.4 |
Avg Runtime (Top 10) | ~100–110 minutes |
High-level dashboard summarizing ROI across films, directors, actors, and production companies.
Compares the ten most profitable films by ROI, emphasizing patterns across production studios and genres.
Lists top individual films by ROI, showing how modest budgets can drive extraordinary returns.
Highlights low-budget horror’s dominance, with Blumhouse, Oren Peli, and Paranormal Activity leading ROI.
Explores yearly ROI patterns and correlations with average film lengths.
Shows how film runtimes and release years impact ROI trends, highlighting shifts over decades.
Reveals ROI distribution by genre, with horror and documentary outperforming big-budget action and drama.
Summarizes strategies used by top-performing films and studios to maximize ROI despite budget constraints.
Documents project build: Power BI, Power Query, SQL, and DAX applied to simulate a real-world analytics request.
Studios invest millions into film production with no guarantee of return. This project examines a targeted subset of the top 10 ROI performers from a larger dataset of 600+ films (1997–2016) to answer key business questions:
- What traits define the most financially successful films?
- Are there repeatable patterns by genre, director, runtime, or studio?
- Do low-budget films reliably outperform high-budget ones?
- Which studios or production approaches consistently deliver outsized returns?
The analysis is framed as if an executive asked:
“We’ve had a few huge wins — how do we find more like those?”
The project reflects how an analyst would approach the request:
- Clean and prepare raw data (SQL, Excel)
- Analyze and surface findings (SQL queries, descriptive analysis)
- Communicate visually with interactive dashboards (Power BI)
Outcome: a multi-page dashboard highlighting repeatable traits of high-ROI films, built to simulate a real-world executive briefing.
- Proven directors dominate revenue: James Cameron, Joss Whedon, and the Russo brothers appear multiple times among the all-time box office leaders.
- Franchises and recurring IP drive scale: Marvel and Disney consistently leverage shared universes, sequels, and bankable actors to maximize global box office returns.
- Animation delivers high average revenue: Pixar and Disney titles show strong monetization potential, even if they don’t top ROI charts.
- Micro-budgets deliver massive ROI: Paranormal Activity achieved ~12,900% ROI on a ~$20K budget, far outpacing blockbuster tentpoles.
- Horror and documentary dominate efficiency: Low-cost genres generated the highest ROI, while crime films consistently underperformed.
- Blumhouse shows a repeatable ROI model: The studio’s lean production approach demonstrates how ROI can be engineered.
- Star power not required: Unknown actors (e.g., Michael C. in Paranormal Activity) led top ROI films, showing concept and execution matter more than celebrity.
- Runtime sweet spot ~100–110 minutes: Top ROI titles cluster around this range, suggesting concise runtimes optimize cost and engagement.
- Replicate micro-budget horror/doc models(Blumhouse-style).
- Target runtimes around 100-110 minutes.
- Use franchises/IP for revenue scale, pair with lean originals for ROI.
- Consider animation as a reliable high revenue genre.
This project uses a modified version of the publicly available movie dataset from Kaggle: https://www.kaggle.com/datasets/utkarshx27/movies-dataset
The dataset includes information on genre, revenue, budget, production companies, and more.
It has been cleaned and structured for educational and analytical purposes in this project.
- Completed missing financials: Filled in missing budget and revenue values using external sources (IMDb, Box Office Mojo) to ensure accurate ROI calculations.
- Standardized formats: Converted release dates into a consistent year format for trend analysis.
- Filtered invalid records: Removed films with zero revenue and excluded entries with invalid or missing genre labels.
- Normalized categorical fields: Cleaned and standardized genre and production company names to resolve duplicates and inconsistencies (e.g., “Universal Pictures” vs. “Universal”).
- ROI formula: (Revenue – Budget) ÷ Budget
- Sample size: ~600 films (1997–2016)
- Top 10 ROI sample: Used for efficiency analysis; focuses on extremes (survivorship bias acknowledged).
- Revenue leaders: Ranked by global box office revenue.
-
Top 10 Performing Movies
Returns the 10 highest-ROI films and their key contributors (director, studio, lead actor), forming the basis for the Top ROI Films analysis. -
Budget to Revenue Ratio
Calculates ROI for all films in the dataset using the formula(Revenue – Budget) / Budget
, enabling dataset-wide comparisons. -
Genre Summary
Aggregates total and average revenue, budget, and ROI by genre to highlight which categories consistently outperform or underperform. -
Compare to Genre Averages
Benchmarks each film’s performance against its genre’s average ROI, surfacing outliers that dramatically over- or under-deliver. -
Top Productions by Company
Identifies studios and production companies most frequently linked to high-ROI titles (e.g., Blumhouse), demonstrating repeatable business models. -
Top 10 Movies vs Average Metrics
Compares top-performing films to dataset averages in runtime, budget, and revenue, uncovering differences that explain ROI outperformance. -
Release Year + Runtime Overview
Analyzes how release year and runtime correlate with ROI, supporting insights into time-period trends and optimal film length. -
Updated View for Movie Dataset
Consolidates cleaned budget, revenue, ROI, runtime, genre, and production company fields into a single view for use in Power BI dashboards.
Before building the dashboard, I ran exploratory checks to understand patterns in the data. These visuals highlight how ROI is distributed, how it varies by genre, and how budget size relates to financial efficiency.
The majority of films generate modest ROI (0–5x budget). Only a few reach extreme levels above 100x, showing how rare runaway hits are.
Horror and Documentary films show higher ROI potential, while genres such as Crime underperform. Outliers were capped to keep the chart readable.
Smaller budgets cluster at higher ROI, while large-budget films are spread across the full range — confirming that blockbuster spending does not guarantee profitability.
These exploratory findings informed the dashboard design and final recommendations.
This project simulates the type of analysis commonly performed in roles such as:
- Customer Experience / Voice of Customer (CX / VoC) Analysts – spotting success patterns in customer or content outcomes and translating them into actionable strategies.
- Business Analysts – supporting investment decisions with evidence from structured data (e.g., budget, ROI, performance trends).
- Media / Content Analysts – analyzing creative, genre, and production data to recommend high-yield strategies for future projects.
Key skills demonstrated in this project:
- Data preparation & cleaning – filled gaps, standardized formats, normalized categories for analysis-ready data.
- Exploratory SQL analysis – uncovered top ROI performers, genre trends, and production patterns using targeted queries and a consolidated view.
- Insight communication – transformed raw results into clear, scoped takeaways aligned to executive-style questions.
- Dashboard design (Power BI) – built multi-page, interactive reporting with slicers, genre filters, and drill-downs.
- Data modeling (Power Query & DAX) – merged multiple tables, defined ROI measures, and applied filtering logic for consistent metrics.
- sql-queries/ – Contains all SQL queries used to analyze the data
- images/ – Contains all screenshots used in the README
- movie_roi_dashboard_update.pbix – Power BI file for interactive exploration
- README.md – Project overview, business questions, and query links
Completed to date
- SQL analysis delivered through 7 structured queries plus a consolidated reporting view.
- Multi-page Power BI dashboard built to visualize ROI patterns by genre, director, studio, and runtime.
- Data preparation and cleaning steps documented for analysis reliability.
Planned improvements
- Share findings publicly in a LinkedIn post to demonstrate professional impact.
- Add a few new DAX measures to deepen ROI tracking and filtering options.
- Refresh the dataset with films released through 2025, including streaming and international box office, for broader comparisons.
This project helped me practice taking a messy dataset and turning it into a story for a business audience.
I strengthened my skills in SQL joins, Power BI modeling, and building multi-page dashboards.
I also learned the importance of documenting cleaning steps so others can reproduce the work.
While I focused my insights on the Top 10 ROI films for clarity, I also built a full dataset view (~600 films) to validate patterns and avoid relying only on outliers.
If I continued this project, I would add regression analysis to measure which variables most strongly predict ROI.
Overall, this project gave me confidence in connecting data preparation, analysis, and communication into one complete case study.
Aaron Zeug
Customer Experience & Reporting Specialist
Building a career in Data Analytics with a focus on clear, insight-driven storytelling.
8+ years in CX and quality, now applying SQL, Excel, and Power BI to real-world analysis projects.
GitHub Profile • LinkedIn
This project is for educational and portfolio purposes only.
The analysis is based on publicly available data and reflects exploratory insights only.
It does not constitute financial, investment, or strategic business advice.
All interpretations are illustrative and intended to demonstrate technical and analytical skills.
Any actions taken based on this material are at the user’s own discretion.