RitvikPatil/PDF-to-Text-Tool

PDF to Text Tool

Explore the Tool »

Developer's contact

About The Project

This is an all-in-one tool that:

  • converts PDF pages to images
  • extracts text from PDF documents using Optical Character Recognition (OCR) via pytesseract

PDF pages --> Images of pages --> Text extracted with OCR
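The pipeline above can be sketched in a few lines of Python. This is a minimal illustration and not necessarily how main.py is written: the function name `extract_text` and its `render_pages` / `ocr_page` parameters are hypothetical, introduced here so that either stage can be swapped out. By default it uses `pdf2image.convert_from_path` to render pages and `pytesseract.image_to_string` to run OCR.

```python
from typing import Any, Callable, List, Optional


def extract_text(
    pdf_path: str,
    render_pages: Optional[Callable[[str], List[Any]]] = None,
    ocr_page: Optional[Callable[[Any], str]] = None,
) -> str:
    """Render each PDF page to an image, OCR it, and join the page texts."""
    if render_pages is None:
        # Imported lazily so the module loads even if pdf2image is missing.
        from pdf2image import convert_from_path
        render_pages = convert_from_path
    if ocr_page is None:
        # Imported lazily for the same reason.
        import pytesseract
        ocr_page = pytesseract.image_to_string

    # Stage 1: PDF pages --> images of pages (one image per page).
    images = render_pages(pdf_path)

    # Stage 2: images of pages --> text extracted with OCR.
    page_texts = [ocr_page(image) for image in images]

    # Separate pages with a blank line (an arbitrary choice for this sketch).
    return "\n\n".join(page_texts)
```

Calling `extract_text("document.pdf")` with the default stages requires the libraries listed under Prerequisites; the file name is only an example.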

Getting Started

Run the tool from the command line, passing the path to the PDF file:

python main.py <pdf_file_path>
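For illustration, a command-line entry point that accepts a single PDF path could be parsed as follows. This is a sketch under assumptions, not the actual contents of main.py: the `parse_args` helper and the use of argparse are hypothetical, and only the positional `pdf_file_path` argument comes from the command shown here.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the single positional argument: the path to the PDF file."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Convert PDF pages to images and extract their text with OCR.",
    )
    # The argument name mirrors the <pdf_file_path> placeholder in the run command.
    parser.add_argument("pdf_file_path", type=Path, help="path to the input PDF")
    # argv=None makes argparse read from sys.argv[1:].
    return parser.parse_args(argv)
```

With this helper, `parse_args(["sample.pdf"]).pdf_file_path` is a `Path` pointing at `sample.pdf`, and running the script without an argument prints a usage message and exits.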

Prerequisites

Python libraries required to run the project:

  • pdf2image
  • pillow
  • pytesseract
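A quick way to confirm the prerequisites are in place is a small check script like the one below. It is a sketch, not part of the repository: the `missing_prerequisites` helper is hypothetical. It also checks for two external programs that the README does not list but that these libraries rely on: pdf2image calls Poppler's `pdftoppm`, and pytesseract calls the `tesseract` executable. Note that pillow is imported under the module name `PIL`.

```python
import importlib.util
import shutil
from typing import List

# Package name (as installed with pip) -> module name (as imported).
REQUIRED_MODULES = {
    "pdf2image": "pdf2image",
    "pillow": "PIL",
    "pytesseract": "pytesseract",
}

# External programs the libraries call; assumed to be needed on PATH.
REQUIRED_BINARIES = ["pdftoppm", "tesseract"]


def missing_prerequisites() -> List[str]:
    """Return the names of Python packages and programs that cannot be found."""
    missing = [
        package
        for package, module in REQUIRED_MODULES.items()
        if importlib.util.find_spec(module) is None
    ]
    missing += [name for name in REQUIRED_BINARIES if shutil.which(name) is None]
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    print("All prerequisites found." if not problems else f"Missing: {', '.join(problems)}")
```

An empty result means every package and program was located; anything listed still needs to be installed.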

Contributing

Contributions and suggestions are greatly appreciated.

If you have a suggestion that would improve the tool, please fork the repo and create a pull request. You can also open an issue with the tag "enhancement". Don't forget to give the project a star. Thanks!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b PDF-to-Text-Tool/suggestion)
  3. Commit your Changes (git commit -m 'Add some suggestion')
  4. Push to the Branch (git push origin PDF-to-Text-Tool/suggestion)
  5. Open a Pull Request

Contact

Ritvik Patil - pritvik0@gmail.com

Project Link: https://github.com/RitvikPatil/PDF-to-Text-Tool
