This project uses convolution neural networks (CNNs) to "sort" 16 different types/classes of documents. We created our own random subset of 1,250 images from the RVL-CDIP (Ryerson Vision Lab Complex Document Information Processing) dataset, which consists of 400,000 grayscale images with 25,000 images per document class. We used 850 images for training, 210 images for validation, and 200 images for testing. We tested two different CNNs on our classification problem: one simple CNN and one CNN that uses transfer learning. The base model we used for transfer learning was the VGG16 model.
- JupyterLab - In order to open the iPython notebooks used in this project you'll need to install JupyterLab. You can install this from https://jupyter.org/install
- Python 3 or higher is necessary to run the iPython notebooks. We reccommend using Python 3.7.6 since that's what we used for this project. Python 3 can be found here https://python.org/downloads
After you download the dataset, add it to the demo folder in your local copy of our GitHub repository. You can download our dataset from this link
In order to run the project demo, go to the demo folder and click on the Demo.ipynb. This file contains a guided walk-through of the project code and our results.
- Dylan Fox
- Emily Turner
- Tyler Christian
- Anthony Ghebranious
- Munayfah Albaqami
"Transfer learning / fine-tuning". https://colab.research.google.com/github/kylemath/ml4a-guides/blob/master/notebooks/transfer-learning.ipynb#scrollTo=hLXTofcNYoa2
J. Phillips. "Open Lab 6: Convolution Neural Networks", March 2021. https://www.cs.mtsu.edu/~jphillips/courses/CSCI4850-5850/private/Open_Lab_6.pdf