📊 Netflix Case Study - Data Exploration and Visualization

🌟 Project Overview

This project explores and visualizes the Netflix dataset to surface insights that can help Netflix decide which types of shows and movies to produce and how to grow the business in different countries.

📈 Business Problem

Analyze the data to answer two questions: which types of shows and movies should Netflix produce, and how can it grow its business in different countries?

📁 Dataset

The dataset contains information about TV shows and movies available on Netflix, including the following attributes:

Show_id: Unique ID for every Movie/TV Show
Type: Identifier - A Movie or TV Show
Title: Title of the Movie/TV Show
Director: Director of the movie/show
Cast: Actors involved in the movie/show
Country: Country where the movie/show was produced
Date_added: Date it was added on Netflix
Release_year: Actual Release year of the movie/show
Rating: TV Rating of the movie/show
Duration: Total duration, in minutes for movies or number of seasons for TV shows
Listed_in: Genre
Description: Summary description

Objectives

Define the problem statement and analyze basic metrics.
Analyze the data structure, detect missing values, and generate a statistical summary.
Perform non-graphical analysis: value counts and unique attributes.
Visual Analysis:
- Univariate and bivariate analysis using various plots (Distplot, Countplot, Boxplot, Heatmaps, Pairplots).
- Missing Value and Outlier check.
Derive business insights and make actionable recommendations.
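
The structure check, missing-value detection, and non-graphical analysis listed above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the rows are hypothetical toy data, and the column names are assumed to match the attribute list in this README (the real CSV may name them differently).

```python
import pandas as pd

# Hypothetical toy rows standing in for the real file; column names follow
# the attribute list in this README and may differ in the actual CSV.
df = pd.DataFrame(
    {
        "Show_id": ["s1", "s2", "s3", "s4"],
        "Type": ["Movie", "TV Show", "Movie", "Movie"],
        "Director": ["A. Director", None, "B. Director", None],
        "Country": ["United States", "India", None, "India"],
        "Release_year": [2019, 2020, 2018, 2021],
    }
)

# Data structure: number of rows/columns and the dtype of each column.
shape = df.shape
dtypes = df.dtypes

# Missing values per column.
missing = df.isna().sum()

# Statistical summary of the numeric columns.
summary = df.describe()

# Non-graphical analysis: value counts and number of unique values.
type_counts = df["Type"].value_counts()
unique_counts = df.nunique()

print(shape)
print(missing)
print(type_counts)
```

With the real dataset, the toy DataFrame would be replaced by a `pd.read_csv(...)` call on the downloaded file.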

📝 Recommendations

Focus on popular genres like Drama, Comedy, and International TV Shows/Movies.
Release TV Shows in July/August and Movies at the end or start of the year.
For the USA, produce movies of 80-120 minutes and Kids TV Shows.
For the UK, keep movies in the same 80-120 minute range and target mature audiences.
For India, increase the number of movies, as that number has been declining since 2018.
Create Anime content for Japan and Romantic TV Shows for South Korea.
Consider popular actors/directors and their combinations while creating content.
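
Recommendations about release timing and genre focus rest on counts by month, type, genre, and country. The sketch below shows one way such counts could be computed with pandas. It is not the notebook's code: the rows are hypothetical, the column names are assumed from the Dataset section, and the date format string is an assumption about how `Date_added` is written.

```python
import pandas as pd

# Hypothetical toy rows; not taken from the real dataset.
df = pd.DataFrame(
    {
        "Type": ["TV Show", "TV Show", "Movie", "Movie", "Movie"],
        "Country": ["United States", "Japan", "India", "United States", "India"],
        "Date_added": [
            "July 1, 2020",
            "August 15, 2019",
            "January 1, 2020",
            "December 20, 2018",
            "January 10, 2021",
        ],
        "Listed_in": [
            "Kids' TV, Comedies",
            "Anime Series",
            "Dramas, Comedies",
            "Dramas",
            "Dramas, International Movies",
        ],
    }
)

# Month in which each title was added (assumed format: "Month day, year").
df["month_added"] = pd.to_datetime(df["Date_added"], format="%B %d, %Y").dt.month

# Titles added per month, split by type (rows: Type, columns: month).
added_by_month = df.groupby(["Type", "month_added"]).size().unstack(fill_value=0)

# One row per (title, genre) pair, then count genres within each country.
genres = df.assign(genre=df["Listed_in"].str.split(", ")).explode("genre")
genre_by_country = genres.groupby(["Country", "genre"]).size()

print(added_by_month)
print(genre_by_country)
```

Splitting `Listed_in` before counting matters because a title tagged with several genres would otherwise be counted as one combined category.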

Challenges Faced

Multiple Directors Issue: Some movies list two directors in a single field, which makes operations such as grouping or counting by director difficult. To manage this, I reduced the granularity.
Duration Column: The 'Duration' column mixes units: numerical values in minutes for movies (e.g., '120 min') and season counts for series (e.g., '2 seasons'), so the two types required separate handling.
Date Column Handling: Pandas read the 'Date_added' column as an object data type, which hindered extracting date parts such as month and year. I converted it using pd.to_datetime().
Missing Value Imputation: Replaced missing values with the most appropriate estimates to improve analysis accuracy.
EDA Challenges: Performed extensive univariate and bivariate analysis to extract meaningful insights from the data.
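
The preprocessing challenges above can be illustrated with a short pandas sketch. It is one possible approach under stated assumptions, not the notebook's code: the rows are hypothetical, the column names are assumed from the Dataset section, splitting and exploding the director field is a common technique that may differ from the "reduced granularity" step used here, and the "Unknown" placeholder is an illustrative choice because this README does not say which estimates were used for imputation.

```python
import pandas as pd

# Hypothetical toy rows; column names assumed from the Dataset section.
df = pd.DataFrame(
    {
        "Title": ["Film A", "Series B", "Film C"],
        "Type": ["Movie", "TV Show", "Movie"],
        "Director": ["Dir One, Dir Two", None, "Dir Three"],
        "Date_added": ["September 25, 2021", "July 1, 2020", None],
        "Duration": ["120 min", "2 Seasons", "95 min"],
    }
)

# Duration: pull out the leading number, then route it to a minutes column
# for movies and a seasons column for TV shows.
number = pd.to_numeric(df["Duration"].str.extract(r"(\d+)", expand=False))
is_movie = df["Type"] == "Movie"
df["duration_min"] = number.where(is_movie)
df["seasons"] = number.where(~is_movie)

# Date_added: strip stray whitespace (if any) and convert from object dtype
# to datetime so that year and month can be extracted. Missing dates become NaT.
df["Date_added"] = pd.to_datetime(df["Date_added"].str.strip(), format="%B %d, %Y")
df["year_added"] = df["Date_added"].dt.year

# Multiple directors: split the comma-separated field and explode it so that
# each row holds one (title, director) pair.
directors = df.assign(Director=df["Director"].str.split(", ")).explode("Director")

# Missing directors: fill with a placeholder (an assumption for illustration).
directors["Director"] = directors["Director"].fillna("Unknown")

print(df[["Title", "duration_min", "seasons", "year_added"]])
print(directors[["Title", "Director"]])
```

Keeping the exploded director table separate from `df` avoids double-counting titles in analyses that do not involve directors.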

📂 Files

PDF Report: Netflix_case_study_2.0.pdf
Jupyter Notebook: Netflix_case_study_2.0.ipynb
Dataset: Netflix Data
