GitHub - formkiq/intelligent-document-classification: DAIR BoosterPack for Intelligent Document Classification

Intelligent Document Classification is a web-based document classification application deployed on the [DAIR Cloud](https://www.canarie.ca/cloud). Using Optical Character Recognition, Natural Language Processing, and Full-Text Search, the Automated Document Classification and Discovery BoosterPack can automate the creation of document metadata.

Architecture

Features

✅ Optical character recognition using Tesseract

✅ Natural language processing using PyTorch and Hugging Face Machine Learning Datasets

✅ Full-text search and metadata storage using Elasticsearch

✅ Web interface for document upload and document search using ReactJS

✅ Event streaming using Apache Kafka
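
Taken together, the features above suggest a pipeline: text is extracted from an uploaded document, classified, and then indexed with its metadata so it can be searched. The sketch below illustrates that flow only and is not the project's code. It uses pure-Python stand-ins: a keyword-count classifier in place of the PyTorch model and an in-memory inverted index in place of Elasticsearch. All names (`LABEL_KEYWORDS`, `classify`, `TinyIndex`) are hypothetical, and the OCR and Kafka steps are assumed to have already delivered plain text.

```python
import re
from collections import defaultdict

# Hypothetical label vocabulary; the real application uses a trained NLP model.
LABEL_KEYWORDS = {
    "invoice": {"invoice", "amount", "due", "payment"},
    "contract": {"agreement", "party", "terms", "signed"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def classify(text):
    """Return the label whose keywords occur most often, or 'unknown'."""
    tokens = tokenize(text)
    scores = {
        label: sum(1 for t in tokens if t in words)
        for label, words in LABEL_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


class TinyIndex:
    """In-memory stand-in for a full-text index with per-document metadata."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> set of document ids
        self.metadata = {}                # document id -> metadata dict

    def add(self, doc_id, text):
        """Classify the text, store the label as metadata, index the tokens."""
        self.metadata[doc_id] = {"label": classify(text)}
        for token in tokenize(text):
            self.postings[token].add(doc_id)

    def search(self, query):
        """Return sorted ids of documents containing every query token."""
        tokens = tokenize(query)
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(hits)


index = TinyIndex()
index.add("doc-1", "Invoice 1042: payment amount due in 30 days")
index.add("doc-2", "This agreement is signed by each party")
print(index.metadata["doc-1"]["label"])
print(index.search("payment due"))
```

In the actual application the classification label would come from the NLP model and the search from Elasticsearch queries; the stand-ins here only show how extracted text becomes searchable metadata.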

Run Local

# Build application
docker-compose -f docker-compose-dev.yml build

# Run application
docker-compose -f docker-compose-dev.yml up -d
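
Because `up -d` starts the containers in the background, the services may need a moment before they accept connections. The Python sketch below checks whether TCP ports are open and can poll until they are. The port numbers are assumptions based on the default ports of Elasticsearch (9200) and Kafka (9092); the mappings in `docker-compose-dev.yml` may differ. `is_port_open` and `wait_for_port` are hypothetical helpers, not part of the repository.

```python
import socket
import time


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, attempts=30, delay=2.0):
    """Poll host:port until it accepts connections or attempts run out."""
    for attempt in range(attempts):
        if is_port_open(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # pause only between attempts
    return False


# Assumed default ports; check docker-compose-dev.yml for the real mappings.
ASSUMED_PORTS = {"elasticsearch": 9200, "kafka": 9092}

for name, port in ASSUMED_PORTS.items():
    status = "reachable" if is_port_open("localhost", port) else "not reachable"
    print(f"{name} (port {port}): {status}")
```

To block until a service comes up rather than check once, call for example `wait_for_port("localhost", 9200)`, which with the default arguments retries for roughly a minute.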

License

Intelligent Document Classification is available under the Apache License, Version 2.0.

Folders and files

api
api_ml
docs
install
kafka
tesseract
ui
.gitignore
LICENSE
README.md
docker-compose-dev.yml
docker-compose-prod.yml
