A web application that translates PDF documents into other languages while preserving the original layout, formatting, and background colors.
- Exact Layout Preservation: Maintains the original PDF layout including text positioning, images, and graphics
- Background Color Detection: Preserves the background color of each text block in the translated document
- Font Style Retention: Maintains bold, italic, and other text formatting from the original document
- Real-time Progress Updates: Provides WebSocket-based progress tracking during translation
- Multiple Language Support: Translates to any language supported by OpenAI's models
- Web Interface: User-friendly interface for uploading and translating PDFs
- Intelligent Text Wrapping: Handles cases where translated text is longer than the original
If you find this project useful, please consider giving it a star on GitHub! Your support helps make this project better.
Contributions are welcome and greatly appreciated! Here's how you can contribute:
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Feel free to check the Issues page for any open tasks or report bugs.
Notice how the application preserves:
- The exact layout and positioning of all elements
- Background colors of text blocks
- Font styles and formatting
- All images and graphical elements
The PDF Translator follows these steps:
- Text Extraction: Extracts text blocks from the original PDF while preserving their positions, font styles, sizes, and page numbers
- Background Color Detection: Analyzes each text block area to identify its background color
- Translation: Sends the extracted text to OpenAI's API for translation
- PDF Recreation: Creates a new PDF by:
  - Copying the original PDF pages exactly
  - Covering original text with rectangles matching the detected background color
  - Adding translated text in the same positions with matching font styles
- Progress Tracking: Provides real-time updates throughout the process via WebSocket communication
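The extraction and translation steps above can be pictured as a transformation over positioned text blocks. The following is a minimal sketch, not the project's actual code: the `TextBlock` structure, its field names, and `translate_blocks` are hypothetical, and the translator is passed in as a plain function so the example runs without an OpenAI API key.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class TextBlock:
    """One extracted text block (hypothetical structure, for illustration)."""
    page: int                                 # zero-based page number
    bbox: Tuple[float, float, float, float]   # (x0, y0, x1, y1) in PDF points
    text: str
    font_size: float
    bold: bool = False
    italic: bool = False
    background: Tuple[int, int, int] = (255, 255, 255)  # detected RGB color


def translate_blocks(blocks: List[TextBlock],
                     translate: Callable[[str], str]) -> List[TextBlock]:
    """Return copies of the blocks with translated text.

    Only the text changes; page, position, font attributes, and background
    color are carried over so the new PDF can reuse the original layout.
    """
    return [replace(block, text=translate(block.text)) for block in blocks]


# Stand-in translator: a lookup table instead of a real API call.
_lookup = {"Hello": "Hola", "World": "Mundo"}
blocks = [TextBlock(page=0, bbox=(72, 72, 200, 90), text="Hello",
                    font_size=12.0, bold=True)]
translated = translate_blocks(blocks, lambda s: _lookup.get(s, s))
```

Keeping the translator as a parameter means the same function could wrap a real API client or a test double without changing how layout data flows through it.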
- Python 3.8+
- OpenAI API key
- Clone this repository:
git clone https://github.com/Codehash001/gpt-pdf-translator.git
cd gpt-pdf-translator
- Install the required dependencies:
pip install -r requirements.txt
- Create a `.env` file in the project directory with your OpenAI API key:
OPENAI_API_KEY=your_openai_api_key_here
- Start the web server:
python main.py
- Open your browser and navigate to http://localhost:8000
- Upload a PDF file, select the target language, and click "Translate"
- Monitor the real-time progress updates during translation
- Download the translated PDF when complete
The application provides the following API endpoints:
- `POST /translate-pdf`: Upload and translate a PDF file
- `GET /download/{filename}`: Download a translated PDF file
- `WebSocket /ws/{task_id}`: Connect to receive real-time progress updates
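For a client that talks to these endpoints directly, the URLs can be derived from the server address. This is a small sketch using only the standard library; the helper names are hypothetical, the base address is the default from the usage steps, and the request body format for `/translate-pdf` is not documented here, so it is deliberately left out.

```python
from urllib.parse import quote, urlsplit, urlunsplit

BASE_URL = "http://localhost:8000"  # default address from the usage steps


def translate_url(base: str = BASE_URL) -> str:
    """URL for POST /translate-pdf."""
    return f"{base.rstrip('/')}/translate-pdf"


def download_url(filename: str, base: str = BASE_URL) -> str:
    """URL for GET /download/{filename}, with the filename percent-encoded."""
    return f"{base.rstrip('/')}/download/{quote(filename, safe='')}"


def progress_ws_url(task_id: str, base: str = BASE_URL) -> str:
    """URL for WebSocket /ws/{task_id}; http maps to ws, https to wss."""
    parts = urlsplit(base)
    scheme = "wss" if parts.scheme == "https" else "ws"
    path = f"{parts.path.rstrip('/')}/ws/{quote(task_id, safe='')}"
    return urlunsplit((scheme, parts.netloc, path, "", ""))
```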
The application detects the background color of each text block by:
- Rendering a small area around the text block as an image
- Analyzing the color distribution to find the most common color
- Using that color when covering the original text before adding the translation
This ensures that colored backgrounds, highlighted text, and other design elements are maintained in the translated document.
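The "most common color" step can be illustrated without any PDF library. The sketch below assumes the rendered area is already available as a sequence of RGB tuples; the function name and the white fallback are illustrative choices, not taken from the project.

```python
from collections import Counter
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]


def dominant_color(pixels: Iterable[RGB],
                   default: RGB = (255, 255, 255)) -> RGB:
    """Return the most frequent color among the pixels.

    Falls back to `default` (white here, an assumption) when the
    sampled area is empty.
    """
    counts = Counter(pixels)
    if not counts:
        return default
    return counts.most_common(1)[0][0]


# A yellow highlight with some dark glyph pixels: the highlight color wins
# because the background covers more of the area than the text strokes.
sample = [(255, 255, 0)] * 90 + [(0, 0, 0)] * 10
background = dominant_color(sample)
```

Counting exact values is the simplest form of this idea; anti-aliased renders may spread the background over several near-identical shades, in which case rounding each channel before counting would be one possible refinement.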
When translated text is longer than the original (common in many language pairs), the application:
- Calculates the available space in the original text block
- Estimates how many characters can fit per line
- Applies intelligent word wrapping to keep the text within the original boundaries
- Adjusts line spacing as needed to accommodate the text
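The wrapping steps above might look roughly like the following. This is a sketch under stated assumptions rather than the application's implementation: the average character width of half the font size, the default line-spacing factors, and the function name are all illustrative.

```python
import textwrap
from typing import List, Tuple


def wrap_to_box(text: str, box_width: float, box_height: float,
                font_size: float, avg_char_width: float = 0.5,
                line_spacing: float = 1.2,
                min_spacing: float = 1.0) -> Tuple[List[str], float]:
    """Wrap text to a box and return (lines, line_height).

    avg_char_width is the assumed glyph width as a fraction of the font
    size; real fonts vary, so this is only an estimate.
    """
    # Estimate how many characters fit on one line of the original block.
    chars_per_line = max(1, int(box_width / (font_size * avg_char_width)))
    lines = textwrap.wrap(text, width=chars_per_line) or [""]

    # Start from the preferred spacing, then tighten it if the wrapped
    # text would overflow the block, down to a minimum spacing.
    line_height = font_size * line_spacing
    if len(lines) * line_height > box_height:
        line_height = max(font_size * min_spacing, box_height / len(lines))
    return lines, line_height


lines, line_height = wrap_to_box("one two three four", box_width=60,
                                 box_height=100, font_size=10)
```

If the text still overflows at the minimum spacing, a caller would need another fallback, such as reducing the font size; the sketch does not attempt that.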
- Very complex PDF layouts with mixed text directions might require manual adjustment
- Custom fonts in the original PDF are replaced with standard substitutes in the translated version
- Documents with text embedded in images require separate OCR processing (not included)
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for providing the translation API
- PyMuPDF (fitz) for PDF processing capabilities
- FastAPI for the web framework