MantriYash/Google-Play-Store-Feature-Engineering


Project Description

This project performs exploratory data analysis (EDA) and feature engineering on the Google Play Store dataset. The goal is to clean messy data, engineer meaningful features, and prepare the dataset for further analysis and possible machine learning modeling.

Skills & Processes Demonstrated

-- Data Cleaning & Preprocessing
-- Handling Missing and Invalid Values
-- String Manipulation and Conversion
-- Feature Engineering (e.g., extracting day, month, year from dates)
-- Data Type Optimization
-- Saving Processed Data
-- Working with Time-Series Data
-- Preparing Data for ML Models
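
The string conversion and date feature extraction steps listed above could look roughly like the sketch below. It is not the repository's code: the function names are hypothetical, and the raw value formats ('10,000+' for Installs, '$4.99' for Price, '19M' or '14k' for Size, 'January 7, 2018' for Last Updated) are assumptions about how the CSV stores these fields. Only the Python standard library is used, and unparseable values become None so they can be handled as missing.

```python
from datetime import datetime
from typing import Optional, Tuple


def parse_installs(raw: str) -> Optional[int]:
    """Turn a string such as '10,000+' into the integer 10000."""
    digits = raw.replace(",", "").replace("+", "").strip()
    return int(digits) if digits.isdigit() else None


def parse_price(raw: str) -> Optional[float]:
    """Turn '$4.99' into 4.99; None if the text cannot be parsed as a float."""
    try:
        return float(raw.replace("$", "").strip())
    except ValueError:
        return None


def parse_size_kb(raw: str) -> Optional[float]:
    """Turn '19M' or '14k' into kilobytes (assumption: 1M = 1000k)."""
    text = raw.strip()
    multipliers = {"M": 1000.0, "k": 1.0}
    # Anything without a known unit suffix is treated as missing.
    if not text or text[-1] not in multipliers:
        return None
    try:
        return float(text[:-1]) * multipliers[text[-1]]
    except ValueError:
        return None


def split_date(raw: str) -> Optional[Tuple[int, int, int]]:
    """Turn 'January 7, 2018' into a (day, month, year) tuple."""
    try:
        parsed = datetime.strptime(raw.strip(), "%B %d, %Y")
    except ValueError:
        return None
    return parsed.day, parsed.month, parsed.year
```

With these helpers, parse_installs("10,000+") gives 10000, parse_size_kb("Varies with device") gives None, and split_date("January 7, 2018") gives (7, 1, 2018). Returning None instead of raising keeps invalid rows visible for a later missing-value step.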

Dataset

Source: Google Play Store Dataset (CSV): https://raw.githubusercontent.com/krishnaik06/playstore-Dataset/main/googleplaystore.csv

Fields: App Name, Category, Rating, Reviews, Size, Installs, Price, Type, Content Rating, Genres, Last Updated, Current Version, Android Version.

Improvements and Future Work

-- Advanced Cleaning: Handle missing values in Size with predictive imputation rather than dropping or ignoring.
-- Data Visualization: Create interactive visualizations using libraries like Plotly or Altair.
-- Statistical Analysis: Explore correlation between app size, installs, and ratings.
-- Clustering: Group apps into clusters based on installs, rating, and price.
-- Machine Learning: Predict an app's success (high installs/rating) based on features.
-- Deployment: Create a dashboard summarizing the cleaned dataset.
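
As a possible starting point for the Size imputation item, the sketch below fills missing sizes with the median of the app's category. This is a simple group-median baseline, not the predictive (model-based) imputation named above, and it is not part of the repository. The function name, the row layout (dicts with "Category" and "Size" keys, Size in kilobytes or None), and the sample rows are all illustrative assumptions.

```python
from collections import defaultdict
from statistics import median
from typing import Any, Dict, List

Row = Dict[str, Any]


def impute_size_by_category(rows: List[Row]) -> List[Row]:
    """Return copies of rows with missing 'Size' filled by the category median.

    Categories with no known sizes fall back to the overall median.
    """
    by_category = defaultdict(list)
    for row in rows:
        if row["Size"] is not None:
            by_category[row["Category"]].append(row["Size"])

    known = [size for sizes in by_category.values() for size in sizes]
    if not known:
        # Nothing to learn from; return unmodified copies.
        return [dict(row) for row in rows]
    overall = median(known)

    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not mutated
        if new_row["Size"] is None:
            sizes = by_category.get(new_row["Category"])
            new_row["Size"] = median(sizes) if sizes else overall
        filled.append(new_row)
    return filled


# Made-up sample rows, for illustration only.
sample_rows = [
    {"Category": "GAME", "Size": 40000.0},
    {"Category": "GAME", "Size": 60000.0},
    {"Category": "GAME", "Size": None},
    {"Category": "TOOLS", "Size": 5000.0},
    {"Category": "FINANCE", "Size": None},
]
filled_rows = impute_size_by_category(sample_rows)
```

Here the missing GAME size becomes 50000.0 (the median of the two known GAME sizes), and the FINANCE row, which has no category peers, takes the overall median of 40000.0. A baseline like this also gives a reference point for judging whether a predictive model is worth the extra complexity.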
