GitHub - RitvikPatil/PDF-to-Text-Tool

PDF to Text Tool

About The Project

This is an all-in-one tool that:

Converts PDF pages to images
Extracts text from PDF documents using Optical Character Recognition (via pytesseract)

Pipeline: PDF pages --> Images of pages --> Text extracted with OCR
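The pipeline above can be sketched in a few lines of Python. This is a minimal illustration, not the code in this repository: `pdf_to_text`, `_default_render`, and `_default_ocr` are hypothetical names. It assumes `pdf2image.convert_from_path` for rendering pages and `pytesseract.image_to_string` for OCR, both imported lazily so the rendering and OCR steps can be swapped out.

```python
from typing import Callable, Iterable, List, Optional


def _default_render(pdf_path: str) -> Iterable:
    # Imported lazily so this module loads even if pdf2image is missing.
    # pdf2image relies on the Poppler utilities being installed.
    from pdf2image import convert_from_path
    return convert_from_path(pdf_path)


def _default_ocr(image) -> str:
    # Imported lazily; pytesseract needs the Tesseract binary on PATH.
    import pytesseract
    return pytesseract.image_to_string(image)


def pdf_to_text(
    pdf_path: str,
    render: Optional[Callable[[str], Iterable]] = None,
    ocr: Optional[Callable[[object], str]] = None,
) -> List[str]:
    """Return one OCR'd string per PDF page (hypothetical helper)."""
    render = render or _default_render
    ocr = ocr or _default_ocr
    # PDF pages --> images of pages --> text extracted with OCR
    return [ocr(page_image) for page_image in render(pdf_path)]
```

Calling `pdf_to_text("document.pdf")` with the defaults would return a list with one string per page, provided the libraries and their system dependencies are installed.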

Getting Started

Run the tool from the command line, passing the path to the PDF file:

python main.py <pdf_file_path>
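For readers curious how a script might accept the `<pdf_file_path>` argument, here is one possible sketch using the standard library's `argparse`. The actual contents of main.py are not shown in this README, so `parse_args` and its help text are assumptions for illustration only.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the single positional argument (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Extract text from a PDF using OCR.",
    )
    # Positional argument matching the <pdf_file_path> placeholder above.
    parser.add_argument("pdf_file_path", help="path to the PDF file to convert")
    # When argv is None, argparse falls back to sys.argv[1:].
    return parser.parse_args(argv)
```

With this sketch, `parse_args(["report.pdf"]).pdf_file_path` holds the path given on the command line, and running the script without an argument prints a usage message.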

Prerequisites

Python libraries required to run the project:

pdf2image
pillow
pytesseract
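A quick way to confirm the environment is ready is to check that each library can be imported. The sketch below is an assumption-laden helper, not part of the project: `check_prerequisites` is a hypothetical name, and the check for the `tesseract` and `pdftoppm` executables reflects the fact that pytesseract wraps the Tesseract OCR engine and pdf2image wraps the Poppler utilities.

```python
import importlib.util
import shutil
from typing import Dict

# Import names can differ from package names: pillow is imported as PIL.
PYTHON_MODULES = ["pdf2image", "PIL", "pytesseract"]
# External programs: Tesseract for OCR, pdftoppm (part of Poppler) for pdf2image.
SYSTEM_BINARIES = ["tesseract", "pdftoppm"]


def check_prerequisites() -> Dict[str, bool]:
    """Map each prerequisite name to whether it was found (hypothetical helper)."""
    status: Dict[str, bool] = {}
    for name in PYTHON_MODULES:
        # find_spec returns None when a top-level module is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    for name in SYSTEM_BINARIES:
        # shutil.which returns None when the executable is not on PATH.
        status[name] = shutil.which(name) is not None
    return status
```

Any entry mapped to `False` points to a package to install with pip or a system tool to install with the operating system's package manager.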

Contributing

Any contributions/suggestions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

Fork the Project
Create your Feature Branch (git checkout -b PDF-to-Text-Tool/suggestion)
Commit your Changes (git commit -m 'Add some suggestion')
Push to the Branch (git push origin PDF-to-Text-Tool/suggestion)
Open a Pull Request

Contact

Ritvik Patil - pritvik0@gmail.com

Project Link: https://github.com/RitvikPatil/PDF-to-Text-Tool

Repository files (4 commits):

README.md
images_to_text.py
main.py
requirements.txt
save_pdf_to_images.py
