XOEROX AI is a document processing application that converts unstructured data into structured data, using Optical Character Recognition (OCR) together with OpenAI for enhanced text extraction and formatting. Users can upload PDF files, extract their text, and convert it into structured Markdown or JSON.
- Frontend: React, TypeScript, Tailwind CSS
- Backend: Node.js, Express, TypeScript
- OCR: Tesseract.js
- AI Integration: OpenAI GPT
- Image Processing: Sharp
- PDF to Image Conversion: pdf-poppler
- Upload PDF files for processing.
- Extract text with OCR and format it with OpenAI, or send the document directly to OpenAI for the best results.
- View extracted content in Markdown or JSON format.
- Real-time status updates during processing.
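
The features above can be read as a small pipeline: pages either go through OCR and are then formatted by the model, or are sent to the model directly, with progress reported along the way. The sketch below is a minimal illustration under assumptions, not the project's actual code. `processDocument`, `PipelineDeps`, `Mode`, and the status strings are hypothetical names, and the OCR and model steps are injected as plain functions so that Tesseract.js or OpenAI calls could be plugged in behind them.

```typescript
// Hypothetical types; none of these names come from the XOEROX codebase.
type Mode = "ocr+ai" | "ai-only";
type OutputFormat = "markdown" | "json";

interface PipelineDeps {
  // Runs OCR on one page image and returns its raw text (e.g. backed by Tesseract.js).
  ocr: (page: Uint8Array) => Promise<string>;
  // Turns raw OCR text into the requested format (e.g. backed by an OpenAI call).
  formatText: (rawText: string, format: OutputFormat) => Promise<string>;
  // Sends page images straight to the model, skipping OCR.
  extractDirect: (pages: Uint8Array[], format: OutputFormat) => Promise<string>;
  // Receives progress messages that a server could forward to the client.
  onStatus: (message: string) => void;
}

async function processDocument(
  pages: Uint8Array[],
  mode: Mode,
  format: OutputFormat,
  deps: PipelineDeps
): Promise<string> {
  if (mode === "ai-only") {
    deps.onStatus("Sending pages to the model");
    const direct = await deps.extractDirect(pages, format);
    deps.onStatus("Done");
    return direct;
  }

  // OCR each page in order, reporting progress per page.
  const texts: string[] = [];
  for (let i = 0; i < pages.length; i++) {
    deps.onStatus(`OCR page ${i + 1}/${pages.length}`);
    texts.push(await deps.ocr(pages[i]));
  }

  deps.onStatus("Formatting text");
  const formatted = await deps.formatText(texts.join("\n\n"), format);
  deps.onStatus("Done");
  return formatted;
}
```

Injecting the dependencies keeps the control flow testable without network access or an OCR engine; the real services would supply their own implementations.
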
- Node.js (v14 or later)
- npm or yarn
- OpenAI API Key
- Clone the Repository

      git clone https://github.com/samrathreddy/xoerox.git
      cd xoerox
- Install Dependencies

  Navigate to both the `client` and `server` directories and install the dependencies:

      # In the client directory
      cd client
      npm install   # or: yarn install

      # In the server directory
      cd ../server
      npm install   # or: yarn install
- Environment Variables

  Create a `.env` file in the `server` directory and add your OpenAI API key and other configurations:

      GPT_API_KEY=your_openai_api_key
      PORT=3000
      OCR_LANGUAGE=eng
      BACKEND_URL=http://localhost:3000

  Create a `.env` file in the `client` directory and add your backend URL:

      BACKEND_URL=http://localhost:3000
- Run the Application

  Start both the client and server:

      # In the client directory
      npm run dev   # or: yarn dev

      # In the server directory (builds and runs in one step)
      npm run build-run   # or: yarn build-run
- Access the Application

  Open your browser and navigate to `http://localhost:5173` to access the XOEROX application. The frontend URL may differ; check the client terminal output for the actual address.
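
The server `.env` keys from the Environment Variables step (`GPT_API_KEY`, `PORT`, `OCR_LANGUAGE`, `BACKEND_URL`) could be validated once at startup so that a missing key fails fast. The following is a minimal sketch, not the project's code: `loadServerConfig` and `ServerConfig` are hypothetical names, and the fallback values (`3000`, `eng`) are assumptions taken from the sample values above.

```typescript
// Hypothetical config shape mirroring the server .env keys.
interface ServerConfig {
  gptApiKey: string;
  port: number;
  ocrLanguage: string;
  backendUrl: string;
}

type Env = Record<string, string | undefined>;

function loadServerConfig(env: Env): ServerConfig {
  // The API key has no sensible default, so its absence is an error.
  const gptApiKey = env["GPT_API_KEY"];
  if (!gptApiKey) {
    throw new Error("GPT_API_KEY is missing; add it to the server .env file");
  }

  // Assumed default: 3000, matching the sample .env.
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`PORT must be a valid port number, got "${env["PORT"]}"`);
  }

  return {
    gptApiKey,
    port,
    // Assumed default: English, matching the sample .env.
    ocrLanguage: env["OCR_LANGUAGE"] ?? "eng",
    backendUrl: env["BACKEND_URL"] ?? `http://localhost:${port}`,
  };
}
```

On the server this could be called as `loadServerConfig(process.env)` once the `.env` file has been loaded. How the project actually loads the file is not stated here; a package such as `dotenv` is one common option.
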
- Client: Contains the React frontend code.
  - Components like `Toolbar`, `FileUpload`, and `OutputViewer` manage the UI and user interactions.
  - TypeScript is used for type safety and better code management.
- Server: Contains the Node.js backend code.
  - Services like `document.service.ts` and `gpt.service.ts` handle document processing and AI integration.
  - Utilizes Tesseract.js for OCR and OpenAI's GPT for text processing.
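
To illustrate the AI integration step, the sketch below builds a request body for formatting OCR text as Markdown or JSON. It is an assumption-laden example, not the contents of `gpt.service.ts`: `buildFormatRequest`, `TargetFormat`, the prompt wording, and the default model name are all hypothetical. Only the payload shape (`model` plus a `messages` array of `role`/`content` pairs) follows the OpenAI Chat Completions API.

```typescript
// Hypothetical names; the README does not show how gpt.service.ts builds its requests.
type TargetFormat = "markdown" | "json";

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
}

function buildFormatRequest(
  ocrText: string,
  format: TargetFormat,
  model = "gpt-4o-mini" // assumed model; the project may use a different one
): ChatRequest {
  // The system message carries the formatting instruction for the chosen output.
  const instruction =
    format === "json"
      ? "Convert the OCR text into a JSON object that captures its structure. Return only JSON."
      : "Convert the OCR text into clean, well-structured Markdown. Return only Markdown.";

  return {
    model,
    messages: [
      { role: "system", content: instruction },
      { role: "user", content: ocrText },
    ],
  };
}
```

The returned object would be sent as the JSON body of a Chat Completions request, authenticated with the key stored in `GPT_API_KEY`. Keeping the builder pure makes the prompt easy to adjust and test without calling the API.
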
Contributions are welcome! Please fork the repository and submit a pull request for any improvements or bug fixes.