GitHub - lenow520/UCI-Beijing-PM2.5: Final project for 'Analysis and Application of Big Data'

2020 Analysis and Application of Big Data final project
Dataset used: Beijing PM2.5 Data Set

Goal: Determine how weather conditions correlate with the PM2.5 index.
Several classification methods (Random Forest, Bayes, Logistic Regression) were applied.

Data Preprocessing
- Replace missing values with the arithmetic mean of the remaining values.
- Remove useless or redundant data.
- Encode categorical data.
- Bin the data into six categories ('excellent', 'good', 'light', 'moderate', 'heavy', and 'severe').
- Normalize the data (min-max normalization).
- Split the dataset into a training set and a test set.
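
The preprocessing steps above can be sketched in plain Python as follows. This is a minimal illustration, not the code from the notebooks. The sample rows are made up, and the column names (`pm2.5`, `TEMP`, `cbwd`) are assumed to match the dataset's CSV header. The bin edges and the use of label encoding are also assumptions, since the README does not specify them. The notebooks may well use pandas or scikit-learn instead of the standard library.

```python
import random
from bisect import bisect_left

# Hypothetical sample rows; the values are invented for illustration.
rows = [
    {"pm2.5": 129.0, "TEMP": -4.0, "cbwd": "SE"},
    {"pm2.5": None, "TEMP": -5.0, "cbwd": "NW"},
    {"pm2.5": 20.0, "TEMP": 3.0, "cbwd": "cv"},
    {"pm2.5": 300.0, "TEMP": 10.0, "cbwd": "NE"},
]

def fill_missing_with_mean(values):
    """Replace None entries with the arithmetic mean of the present values."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

def label_encode(values):
    """Map each category to an integer (assumed encoding; one-hot is another option)."""
    mapping = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

# Assumed upper bin edges in ug/m^3; the README names the labels but not the thresholds.
BIN_EDGES = [35, 75, 115, 150, 250]
BIN_LABELS = ["excellent", "good", "light", "moderate", "heavy", "severe"]

def bin_pm25(value):
    """Assign a PM2.5 value to one of the six labels (upper edges inclusive)."""
    return BIN_LABELS[bisect_left(BIN_EDGES, value)]

def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def train_test_split(items, test_ratio=0.2, seed=0):
    """Shuffle a copy of the items and split off a test set of at least one item."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

pm25 = fill_missing_with_mean([r["pm2.5"] for r in rows])
labels = [bin_pm25(v) for v in pm25]
temp = min_max([r["TEMP"] for r in rows])
wind, wind_map = label_encode([r["cbwd"] for r in rows])

samples = list(zip(temp, wind, labels))
train, test = train_test_split(samples, test_ratio=0.25)
```

Here the binned label serves as the classification target, while the normalized temperature and encoded wind direction serve as features. Fixing the shuffle seed keeps the split reproducible between runs.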

Files
- PRSA_data.csv
- README.md
- final.ipynb
- preprocesseddata.ipynb
