DocuSort

Summary

This project uses convolutional neural networks (CNNs) to "sort" 16 different types/classes of documents. We created our own random subset of 1,250 images from the RVL-CDIP (Ryerson Vision Lab Complex Document Information Processing) dataset, which consists of 400,000 grayscale images with 25,000 images per document class. We used 850 images for training, 210 images for validation, and 200 images for testing. We tested two different CNNs on our classification problem: a simple CNN and a CNN that uses transfer learning. The base model for transfer learning was VGG16.
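As a rough illustration of the random train/validation/test split described above, the following minimal sketch draws three disjoint subsets from a list of image paths. It is not the code used in this project: the function name `split_subset`, the seed, and the placeholder file names are assumptions made for the example, and the sampling is uniform rather than balanced per document class.

```python
import random


def split_subset(paths, sizes=(850, 210, 200), seed=0):
    """Randomly draw disjoint train/validation/test subsets from `paths`.

    `sizes` holds the split sizes quoted in the summary; `seed` is an
    arbitrary value chosen here only to make the example repeatable.
    """
    total = sum(sizes)
    if total > len(paths):
        raise ValueError("not enough images for the requested split sizes")

    rng = random.Random(seed)
    chosen = rng.sample(paths, total)  # sampling without replacement

    n_train, n_val, _ = sizes
    train = chosen[:n_train]
    val = chosen[n_train:n_train + n_val]
    test = chosen[n_train + n_val:]
    return train, val, test


# Hypothetical file names standing in for RVL-CDIP images.
all_paths = [f"image_{i:06d}.tif" for i in range(5000)]
train, val, test = split_subset(all_paths)
print(len(train), len(val), len(test))
```

Because the three subsets are slices of a single sample drawn without replacement, no image can appear in more than one split. A per-class (stratified) draw would need the class label for each path, which this sketch does not model.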

Getting Started

Installing Necessary Programs

JupyterLab - You'll need JupyterLab to open the IPython notebooks used in this project. You can install it from https://jupyter.org/install
Python 3 - Python 3 or higher is required to run the IPython notebooks. We recommend Python 3.7.6, since that's the version we used for this project. Python 3 can be downloaded from https://python.org/downloads

Download the Dataset

You can download our dataset from this link. After you download it, add it to the demo folder in your local copy of our GitHub repository.

Demo

To run the project demo, go to the demo folder and open Demo.ipynb in JupyterLab. This notebook contains a guided walk-through of the project code and our results.

Members

Dylan Fox
Emily Turner
Tyler Christian
Anthony Ghebranious
Munayfah Albaqami

References

"Transfer learning / fine-tuning". https://colab.research.google.com/github/kylemath/ml4a-guides/blob/master/notebooks/transfer-learning.ipynb#scrollTo=hLXTofcNYoa2

J. Phillips. "Open Lab 6: Convolution Neural Networks", March 2021. https://www.cs.mtsu.edu/~jphillips/courses/CSCI4850-5850/private/Open_Lab_6.pdf
