Steps for a new data analysis project:
-
Check the data health; clean, transform, and prepare the data for analysis (this usually takes 60% to 80% of the time in a data analysis project):
- Missing values;
- Duplicate records;
- Redundant data: for example, total amount columns that can be derived from other columns. Remove data that you do not need;
- Data types;
- Consistency of formats: whole numbers vs. decimal numbers, date formats, etc.;
- Consistency of representations: differences in capitalization, spacing, and grammatical gender of adjectives;
- Spelling errors.
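
The checklist above can be sketched with pandas roughly as follows. This is only a minimal illustration, not a fixed recipe: the table, its column names (`customer`, `city`, `quantity`, `unit_price`, `total`, `order_date`), the misspelling map, and the choice to fill missing cities with "Unknown" are all made up for the example, and real data will usually need different decisions at each step.

```python
import pandas as pd

# Hypothetical raw data, invented purely for illustration.
raw = pd.DataFrame({
    "customer": ["Ana", "ana ", "Bruno", "Bruno", "Carla"],
    "city": ["Lisbon", "lisbonn ", "Porto", "Porto", None],
    "quantity": ["2", "2", "1", "1", "3"],          # numbers stored as text
    "unit_price": [10.0, 10.0, 5.5, 5.5, 4.0],
    "total": [20.0, 20.0, 5.5, 5.5, 12.0],          # derivable from the two columns above
    "order_date": ["2024-01-05", "2024-01-05", "2024-01-07",
                   "2024-01-07", "2024-01-09"],
})

# Missing values: count them per column before deciding what to do.
missing_per_column = raw.isna().sum()

df = raw.copy()

# Consistency of representations: normalize spacing and capitalization.
df["customer"] = df["customer"].str.strip().str.title()
df["city"] = df["city"].str.strip().str.title()

# Spelling errors: map known misspellings to the correct value
# (the mapping is an assumption made for this example).
df["city"] = df["city"].replace({"Lisbonn": "Lisbon"})

# Missing values: here we fill with a placeholder; dropping the rows
# or imputing may be a better choice depending on the analysis.
df["city"] = df["city"].fillna("Unknown")

# Data types and consistency of formats: text to numbers and dates.
df["quantity"] = pd.to_numeric(df["quantity"])
df["order_date"] = pd.to_datetime(df["order_date"], format="%Y-%m-%d")

# Redundant data: drop "total" only after confirming it can be recomputed.
total_is_redundant = ((df["quantity"] * df["unit_price"]) == df["total"]).all()
if total_is_redundant:
    df = df.drop(columns=["total"])

# Duplicate records: remove them last, since rows that differ only in
# capitalization or spacing become identical after normalization.
clean = df.drop_duplicates().reset_index(drop=True)

print(missing_per_column)
print(clean)
```

Removing duplicates after normalizing text is a deliberate ordering choice in this sketch: records such as "Ana" and "ana " would otherwise be treated as different rows.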
-
Understand the data - EDA (Exploratory Data Analysis);
-
Define the audience;
-
From the exploratory analysis, prepare explanatory material.
-
Data visualization - Choosing a chart:
-
Choosing colors for data visualization:
-
Practice makes perfect:
-
Data Science Competitions:
-
Useful:
-