Anirudh58/document-classification-layoutlm

Document Classification using LayoutLM

About

This repository is a recipe for fine-tuning a LayoutLM model for document classification. The dataset used for training and evaluation is specific to my use case, but the code can be easily modified to work with any dataset.

Steps to reproduce

Folder Structure

  • Clone/unzip the project.
  • Place the dataset in a folder named data in the root directory of the project.
  • The final folder structure should look something like this:
├── data
│   ├── images
│   │   ├── 0
│   │   ├── 2
│   │   ├── 4
│   │   ├── 6
│   │   └── 9
│   └── ocr
│       ├── 0
│       ├── 2
│       ├── 4
│       ├── 6
│       └── 9
├── environment.yml
├── models
│   └── layoutlm-model
│       ├── config.json
│       └── pytorch_model.bin
├── notebooks
│   ├── dataset.ipynb
│   ├── dataset.pdf
│   ├── modeling.ipynb
│   └── modeling.pdf
├── README.md
└── src
    ├── dataset.py
    ├── __pycache__
    │   ├── dataset.cpython-39.pyc
    │   └── utils.cpython-39.pyc
    └── utils.py
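
A quick way to confirm the layout above before opening the notebooks is to walk `data/images` and `data/ocr` and pair the files up. The following is a minimal sketch, not the code in `src/dataset.py`: the function name `index_dataset` is hypothetical, and it assumes that each OCR file shares its image's file stem, which the README does not state. It also maps the non-contiguous folder names (0, 2, 4, 6, 9) to contiguous class ids, which a classification head usually expects.

```python
from pathlib import Path


def index_dataset(data_dir):
    """Pair each image under images/<label>/ with its OCR file under ocr/<label>/."""
    data_dir = Path(data_dir)
    image_root = data_dir / "images"
    ocr_root = data_dir / "ocr"

    # Class folders are named by label (0, 2, 4, 6, 9 in the tree above).
    labels = sorted(p.name for p in image_root.iterdir() if p.is_dir())
    # Map the folder names to contiguous ids 0..n-1.
    label2id = {label: i for i, label in enumerate(labels)}

    samples, missing = [], []
    for label in labels:
        # Assumption: an OCR file has the same stem as its image.
        ocr_dir = ocr_root / label
        ocr_by_stem = {}
        if ocr_dir.is_dir():
            ocr_by_stem = {p.stem: p for p in ocr_dir.iterdir() if p.is_file()}
        for image_path in sorted((image_root / label).iterdir()):
            if not image_path.is_file():
                continue
            ocr_path = ocr_by_stem.get(image_path.stem)
            if ocr_path is None:
                missing.append(image_path)  # image without OCR output
            else:
                samples.append((image_path, ocr_path, label2id[label]))
    return samples, label2id, missing
```

Calling `index_dataset("data")` from the project root returns the paired samples, the label mapping, and any images that have no matching OCR file.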

Environment Setup

  • Create a conda environment with the given yml file as follows:
conda env create -f environment.yml
  • Activate the environment:
conda activate dcl
  • Note: I configured my system to use my GPU with the following specifications. Some packages may have to be installed manually depending on your system and OS configuration.

    • OS: Ubuntu 22.04
    • GPU: NVIDIA GeForce RTX 3050
    • CUDA: 11.7
    • NVIDIA Driver: 515
    • PyTorch: 2.0.0+cu117
  • I used Tesseract to perform OCR because I needed both the text and the bounding box information from the document images. Follow the steps from here
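
To illustrate the OCR step, here is a minimal sketch of extracting words and bounding boxes with Tesseract. It assumes the `pytesseract` wrapper and Pillow are installed; the README only says Tesseract was used, so the wrapper and the function names `normalize_box` and `ocr_words_and_boxes` are assumptions, not the repository's code. LayoutLM expects box coordinates scaled to the 0-1000 range, so the pixel boxes are normalized by the page size.

```python
def normalize_box(left, top, width, height, page_width, page_height):
    """Convert a pixel box (left, top, width, height) to [x0, y0, x1, y1] on a 0-1000 scale."""
    x0 = int(1000 * left / page_width)
    y0 = int(1000 * top / page_height)
    x1 = int(1000 * (left + width) / page_width)
    y1 = int(1000 * (top + height) / page_height)
    # Clamp to guard against boxes that spill over the page edge.
    return [max(0, min(1000, v)) for v in (x0, y0, x1, y1)]


def ocr_words_and_boxes(image_path):
    """Run Tesseract on one image and return (words, normalized boxes)."""
    # Imported here so normalize_box stays usable without the OCR dependencies.
    import pytesseract
    from PIL import Image

    image = Image.open(image_path).convert("RGB")
    page_width, page_height = image.size
    data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)

    words, boxes = [], []
    for text, left, top, width, height in zip(
        data["text"], data["left"], data["top"], data["width"], data["height"]
    ):
        if not text.strip():
            continue  # skip empty entries emitted for blocks and lines
        words.append(text)
        boxes.append(normalize_box(left, top, width, height, page_width, page_height))
    return words, boxes
```

The word list and the box list stay aligned one-to-one, so they can be stored together per image, for example in the `data/ocr` folder.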

Running the code

  • Open the dataset.ipynb notebook and run the cells in order.

    • Here, you can visualize a few document images and their corresponding bounding boxes.
    • This ensures that the dataset is correctly loaded and the bounding boxes are correctly extracted.
  • Open the modeling.ipynb notebook and run the cells in order.

    • You can modify some of the hyperparameters in the notebook to see how they affect the model performance.
    • During training, you should see a training accuracy of around 0.98 and a validation accuracy of around 0.9 after 5 epochs.
    • The model is then saved in the models folder.
    • During testing, the trained model is loaded and evaluated on the test set. Here, you should see a test accuracy of around 0.94.

Classification Statistics for my use case

  • Train Accuracy: 0.982
  • Validation Accuracy: 0.9
  • Test Accuracy: 0.944
  • Average Precision: 0.951
  • Average Recall: 0.944
  • Average F1-Score: 0.946
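
The README does not say how the averages above are computed. The sketch below assumes support-weighted averaging, where each class contributes in proportion to its number of true samples; under that scheme the averaged recall always equals accuracy, which is consistent with the reported recall matching the test accuracy. The function name `weighted_metrics` is hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)    # true samples per class
    predicted = Counter(y_pred)  # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, n in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n / total  # assumption: weight each class by its support
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(correct.values()) / total
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Passing the test-set labels and the model's predicted labels as two equal-length lists yields all four numbers in one call.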
