Neural Network Based project to interpolate missing data.

This project does not use any machine learning library to build the prediction model.

Its main purpose is to predict (interpolate) a respondent's unanswered questions in a survey.

My model uses k-means clustering and CountVectorizer to convert natural-language answers into vectors, and applies one-hot encoding to turn the dataset into training data. I implemented two models to predict (interpolate) answers: a neural network with one hidden layer and an autoencoder with two hidden layers. I built a directed graph of the mutual dependencies between features from the results of analyzing the dataset with the neural network. Sentiment analysis based on a logistic regression model is used to analyze the natural-language answers.
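As a rough illustration of the one-hidden-layer approach, here is a minimal sketch written with NumPy only. It is not the code from FINAL.ipynb: the function names (`one_hot`, `train_one_hidden_layer`, `predict`), the sigmoid activations, the cross-entropy gradient, the hyperparameters, and the toy yes/no data are all assumptions made for this example.

```python
import numpy as np

def one_hot(column, categories):
    """Encode a list of categorical answers as one-hot rows."""
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(column), len(categories)))
    for row, value in enumerate(column):
        out[row, index[value]] = 1.0
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_one_hidden_layer(X, Y, hidden=8, lr=0.5, epochs=2000, seed=0):
    """Train a network with one hidden layer by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    n = X.shape[0]
    for _ in range(epochs):
        # Forward pass
        H = sigmoid(X @ W1 + b1)
        P = sigmoid(H @ W2 + b2)
        # Backward pass: gradient of the mean cross-entropy loss
        dZ2 = (P - Y) / n
        dW2 = H.T @ dZ2
        db2 = dZ2.sum(axis=0)
        dZ1 = (dZ2 @ W2.T) * H * (1.0 - H)
        dW1 = X.T @ dZ1
        db1 = dZ1.sum(axis=0)
        # Gradient descent update
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return W1, b1, W2, b2

def predict(X, params):
    """Return the output activations for each row of X."""
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)

# Toy data (hypothetical): the answer to one question predicts another.
X = one_hot(["yes", "no", "yes", "no"], ["yes", "no"])
Y = one_hot(["agree", "disagree", "agree", "disagree"], ["agree", "disagree"])
params = train_one_hidden_layer(X, Y)
predicted = predict(X, params).argmax(axis=1)
```

In this sketch the predicted answer for a missing question is the category with the highest output activation.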

Tricky Part - Limited Size of Dataset

The survey dataset is relatively small: only 4000 respondents answered the survey, and most of them did not answer every question.

Solution:

I create a large amount of training data by knocking out various subsets of features for each respondent. This enlarges the training dataset and also helps prevent over-fitting.
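The knock-out idea can be sketched as follows. This is only an illustration under stated assumptions: the helper name `knock_out_augment`, the `blocks` layout (one column range per question in the one-hot matrix), the uniform random choice of which questions to hide, and the toy data are hypothetical and are not taken from the notebook.

```python
import numpy as np

def knock_out_augment(answers, blocks, n_copies=5, max_knocked=2, seed=0):
    """Build extra training pairs by hiding random subsets of questions.

    answers: 2-D array of one-hot encoded responses, one row per respondent.
    blocks:  list of (start, stop) column ranges, one per question.
    Returns (inputs, targets): masked rows and the original rows they came from.
    """
    rng = np.random.default_rng(seed)
    inputs, targets = [], []
    for row in answers:
        for _ in range(n_copies):
            # Hide between 1 and max_knocked questions, chosen at random.
            k = rng.integers(1, max_knocked + 1)
            hidden = rng.choice(len(blocks), size=k, replace=False)
            masked = row.copy()
            for q in hidden:
                start, stop = blocks[q]
                masked[start:stop] = 0.0  # zero out the whole one-hot block
            inputs.append(masked)
            targets.append(row)
    return np.array(inputs), np.array(targets)

# Toy data (hypothetical): two respondents, three questions with two options each.
answers = np.array([[1, 0, 0, 1, 1, 0],
                    [0, 1, 1, 0, 0, 1]], dtype=float)
blocks = [(0, 2), (2, 4), (4, 6)]
X_aug, Y_aug = knock_out_augment(answers, blocks, n_copies=4)
```

Each respondent yields several (masked input, full target) pairs, so a model trained on them has to reconstruct answers it cannot see, which is the same task as interpolating questions that were never answered.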

Generated Dependency Graph for Features

Full report can be found here

Usage

Use Jupyter Notebook to run ./FINAL.ipynb

License

MIT License. See LICENSE.md.

Contact

haodong.zheng@rutgers.edu
