This project performs an exploratory data analysis (EDA) of trending YouTube videos, using the USvideos.csv dataset I downloaded from Kaggle. The goal is to uncover patterns in views, likes, comments, tags, publishing times, and video categories. The repo contains the necessary files: the dataset, the .ipynb notebook, and a .json file.
Python
Pandas & NumPy
Matplotlib & Seaborn
WordCloud
NLTK
JSON
Trending Duration: Analyzed how many days each video stayed on the trending list. Explored the relationship between trending duration and average views.
Correlation Study: Investigated how metrics like views, likes, comments, and trending days are correlated. Used heatmaps and pairplots for clear visual understanding.
Tags & Titles Insights: Identified the most common tags used in the top 10% of trending videos. Used WordCloud to visualize frequently used words in titles and descriptions.
Publishing Time Impact: Explored how the day of the week and hour of publishing affect video performance. Found the best timeframes for maximizing views.
Category Analysis: Mapped category IDs to human-readable names using a JSON file. Visualized which categories trend the most using a donut chart.
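The analyses above can be sketched in a few lines of pandas. This is a minimal illustration, not the notebook's actual code. It assumes the column names commonly found in the Kaggle file (video_id, trending_date, publish_time, category_id, tags, views), tags separated by "|", and a category JSON shaped like {"items": [{"id": ..., "snippet": {"title": ...}}]}. The function names and the tiny sample data are hypothetical.

```python
import pandas as pd


def trending_days(df):
    """Per video: number of distinct trending dates and average views."""
    return (
        df.groupby("video_id")
        .agg(trending_days=("trending_date", "nunique"),
             avg_views=("views", "mean"))
        .reset_index()
    )


def add_category_names(df, category_json):
    """Map category_id to a readable name (assumed JSON layout, see above)."""
    id_to_name = {
        int(item["id"]): item["snippet"]["title"]
        for item in category_json["items"]
    }
    out = df.copy()
    out["category"] = out["category_id"].map(id_to_name)
    return out


def views_by_publish_slot(df):
    """Mean views grouped by publishing weekday and hour (UTC)."""
    published = pd.to_datetime(df["publish_time"], utc=True)
    return (
        df.assign(weekday=published.dt.day_name(), hour=published.dt.hour)
        .groupby(["weekday", "hour"])["views"]
        .mean()
    )


def top_tags(df, quantile=0.9, n=10):
    """Most common tags among rows at or above the given views quantile."""
    top = df[df["views"] >= df["views"].quantile(quantile)]
    tags = top["tags"].str.split("|").explode().str.strip('"').str.lower()
    return tags.value_counts().head(n)


# Hypothetical sample rows standing in for USvideos.csv.
sample = pd.DataFrame({
    "video_id": ["a", "a", "b"],
    "trending_date": ["17.14.11", "17.15.11", "17.14.11"],
    "publish_time": [
        "2017-11-13T17:13:01.000Z",
        "2017-11-13T17:13:01.000Z",
        "2017-11-12T09:00:00.000Z",
    ],
    "category_id": [10, 10, 24],
    "tags": ['music|"pop"', 'music|"pop"', "comedy"],
    "views": [100, 300, 50],
})
# Hypothetical stand-in for the parsed category .json file.
categories = {
    "items": [
        {"id": "10", "snippet": {"title": "Music"}},
        {"id": "24", "snippet": {"title": "Entertainment"}},
    ]
}

print(trending_days(sample))
print(add_category_names(sample, categories)["category"].value_counts())
print(views_by_publish_slot(sample))
print(top_tags(sample))
```

With the real data, the sample frame would be replaced by pd.read_csv("USvideos.csv") and the categories dict by the parsed .json file from the repo. The per-video table returned by trending_days can then be passed to DataFrame.corr() for the correlation study.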