GitHub - ramachandra742/document-classification-ML: Document classification using ML

Document-Classification

This project classifies the documents in a corpus by category. The corpus has 12 categories:
'books', 'cinema', 'cooking', 'gaming', 'sports', 'tech', 'data_science', 'design', 'news', 'politics', 'do_it_yourself', and 'business'. Three algorithms are used to classify the documents: logistic regression, SVC (support vector classification), and MultinomialNB (multinomial Naive Bayes).
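
To give a feel for how one of these classifiers assigns a category, here is a minimal sketch of multinomial Naive Bayes over word counts, written with the Python standard library only. It is not the code in this repository: the function names (`tokenize`, `train_multinomial_nb`, `predict`), the whitespace tokenizer, the smoothing value `alpha=1.0`, and the six toy documents are all assumptions made for illustration. Only three of the 12 category names are used.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed per-class word log-likelihoods."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)

    log_prior = {}
    log_likelihood = {}
    for label, n_docs in class_counts.items():
        log_prior[label] = math.log(n_docs / len(docs))
        # Additive (Laplace) smoothing so unseen class/word pairs keep
        # a small non-zero probability.
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        log_likelihood[label] = {
            word: math.log((word_counts[label][word] + alpha) / denom)
            for word in vocab
        }
    return {"vocab": vocab, "log_prior": log_prior,
            "log_likelihood": log_likelihood}


def predict(model, text):
    """Return the class with the highest posterior log score."""
    # Words never seen in training are skipped.
    tokens = [t for t in tokenize(text) if t in model["vocab"]]
    scores = {
        label: prior + sum(model["log_likelihood"][label][t] for t in tokens)
        for label, prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)


# Toy data invented for this sketch; it is not the project's dataset.
train_docs = [
    "the team won the match in the final minute",
    "the striker scored a late goal",
    "the new phone ships with a faster chip",
    "the laptop has a faster processor and more memory",
    "simmer the sauce and add fresh basil",
    "bake the bread until the crust is golden",
]
train_labels = ["sports", "sports", "tech", "tech", "cooking", "cooking"]

model = train_multinomial_nb(train_docs, train_labels)
print(predict(model, "the striker scored in the final match"))
```

Logistic regression and SVC would consume the same kind of word-count (or TF-IDF) features but learn a decision boundary instead of per-class word probabilities.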

Installation

Clone this repo:

git clone git@github.com:ramachandra742/Document-Classification.git

Download dataset

Check here

References

Benjamin Bengfort, Rebecca Bilbro, and Tony Ojeda. Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning.

Repository files (13 commits)

CorpusLoader.py
Document Classification.ipynb
LICENSE
PickledCorpusReader.py
README.md
_config.yml
dataset_info.py
model.py
results.py
