lenow520/UCI-Beijing-PM2.5

2020 Analysis and Application of Big Data final project
Dataset used: Beijing PM2.5 Data Set

Goal: Determine how weather conditions correlate with the PM2.5 index.
Several classification methods (Random Forest, Bayes, Logistic Regression) were applied.
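
As a rough illustration of comparing the three classifier families named above, here is a minimal sketch using scikit-learn. It is not the project's actual code: the data is synthetic (`make_classification`), `GaussianNB` is an assumed stand-in for the Bayes classifier, and all hyperparameters are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the preprocessed weather features and binned labels.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# One model per classifier family; GaussianNB is an assumption, since the
# README only says "Bayes".
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gaussian_nb": GaussianNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

# Fit each model and record its accuracy on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.3f}")
```

Accuracy alone can mislead when the classes are imbalanced, so a per-class report (for example `sklearn.metrics.classification_report`) may be worth adding on the real data.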

  • Data Preprocessing
    • Replace missing values with the arithmetic mean of the remaining values.
    • Remove useless or redundant data.
    • Encode categorical data.
    • Bin the data into six levels ('excellent', 'good', 'light', 'moderate', 'heavy', and 'severe').
    • Normalize the data (min-max normalization).
    • Split the dataset into a training set and a test set.
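
The preprocessing steps above could be sketched with pandas and scikit-learn roughly as follows. This is an illustration under several assumptions, not the project's code: the frame is synthetic, the column names (`No`, `pm2.5`, `DEWP`, `TEMP`, `PRES`, `cbwd`, `Iws`) are assumed to match the dataset's CSV, the bin edges are hypothetical, and the 80/20 split ratio is a placeholder.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Small synthetic frame standing in for the real CSV (column names assumed).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "No": np.arange(1, n + 1),
    "pm2.5": rng.uniform(0, 400, n),
    "DEWP": rng.uniform(-30, 25, n),
    "TEMP": rng.uniform(-15, 40, n),
    "PRES": rng.uniform(990, 1045, n),
    "cbwd": rng.choice(["NE", "NW", "SE", "cv"], n),
    "Iws": rng.uniform(0, 300, n),
})
df.loc[rng.choice(n, 20, replace=False), "pm2.5"] = np.nan  # simulate gaps

# 1. Replace missing values with the mean of the observed values.
df["pm2.5"] = df["pm2.5"].fillna(df["pm2.5"].mean())

# 2. Drop a column that carries no signal (here, the row counter).
df = df.drop(columns=["No"])

# 3. One-hot encode the categorical wind direction.
df = pd.get_dummies(df, columns=["cbwd"], dtype=float)

# 4. Bin PM2.5 into the six labelled levels (edges are hypothetical).
edges = [-np.inf, 35, 75, 115, 150, 250, np.inf]
labels = ["excellent", "good", "light", "moderate", "heavy", "severe"]
y = pd.cut(df.pop("pm2.5"), bins=edges, labels=labels)

# 5. Min-max normalization: rescale every feature column to [0, 1].
X = (df - df.min()) / (df.max() - df.min())

# 6. Split into training and test sets (80/20 is an assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
```

The sketch normalizes before splitting to follow the order of the list. Computing the minimum and maximum on the training set only and reusing them for the test set would avoid leaking test-set statistics into training.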
