Intelligent Document Classification is a web-based document classification application deployed on the [DAIR Cloud](https://www.canarie.ca/cloud). Using optical character recognition, natural language processing, and full-text search, the Automated Document Classification and Discovery BoosterPack can automate the creation of document metadata.
✅ Optical character recognition using Tesseract
✅ Natural language processing using PyTorch and Hugging Face Machine Learning Datasets
✅ Full-text search and metadata storage using Elasticsearch
✅ Web interface for document upload and document search using ReactJS
✅ Event streaming using Apache Kafka
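
To illustrate how the components listed above could fit together, here is a minimal Python sketch of the OCR, classification, and indexing flow. It is not taken from the BoosterPack source: `DocumentMetadata`, `process_document`, `fake_ocr`, `keyword_classifier`, and the sample labels are all hypothetical names chosen for this example. The three stages are passed in as plain callables, so in a real deployment they might wrap Tesseract, a PyTorch/Hugging Face model, and an Elasticsearch client respectively. Kafka event streaming is left out for brevity.

```python
from dataclasses import dataclass, asdict
from typing import Callable, Dict, List


@dataclass
class DocumentMetadata:
    """Hypothetical metadata record produced for one uploaded document."""
    filename: str
    text: str
    label: str
    score: float


def process_document(
    filename: str,
    image_bytes: bytes,
    ocr: Callable[[bytes], str],
    classify: Callable[[str], Dict[str, float]],
    index: Callable[[dict], None],
) -> DocumentMetadata:
    """Run OCR, classify the extracted text, and store the metadata."""
    text = ocr(image_bytes).strip()
    scores = classify(text)
    # Keep the highest-scoring label as the document class.
    label = max(scores, key=scores.get)
    meta = DocumentMetadata(filename, text, label, scores[label])
    index(asdict(meta))
    return meta


# Stand-ins so the sketch runs without any external services (assumptions).
def fake_ocr(image_bytes: bytes) -> str:
    # Pretends the "image" already contains its text.
    return image_bytes.decode("utf-8")


def keyword_classifier(text: str) -> Dict[str, float]:
    # Toy keyword rules in place of a trained model.
    lowered = text.lower()
    return {
        "invoice": 0.9 if "invoice" in lowered else 0.1,
        "contract": 0.9 if "agreement" in lowered else 0.1,
    }


store: List[dict] = []  # In-memory list in place of a search index.

result = process_document(
    "scan-001.png",
    b"Invoice #42: amount due 100 CAD\n",
    fake_ocr,
    keyword_classifier,
    store.append,
)
print(result.label, result.score)
```

Passing the stages in as callables keeps the flow testable without a running Tesseract, model, or Elasticsearch instance; whether the actual application is structured this way is an assumption.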
# Build application
docker-compose -f docker-compose-dev.yml build
# Run application
docker-compose -f docker-compose-dev.yml up -d
Intelligent Document Classification is available under the Apache License, Version 2.0.