Copyright © 2025 Tomasz Kamiński
Discover why this is the world's first AI-OCR game translator with our comprehensive showcase featuring a video clip, competitive comparisons, and all the innovative features! 🚀
Game-Changing Translator is a powerful desktop application that automatically captures text from any area of your screen, performs optical character recognition (OCR), and translates the text in real-time. It creates floating overlay windows that can be positioned anywhere on your screen, making it perfect for translating games, videos, PDFs, or any application with text that you can't easily copy and paste.
This project was inspired by a family member who was learning French by playing games with French subtitles and needed real-time translation. I hope it will be useful to gamers and non-gamers alike for casual on-screen translation, whether you're learning a new language through entertainment or simply need to understand content in a foreign language.
This application was developed, or rather vibe-coded, with the support of the following AI models: Claude 3.7 Sonnet, Claude Sonnet 4 and Gemini 2.5 Pro.
🎮 ► Watch Live Demo: The Witcher Real-Time Translation
Click the thumbnail above or the link to watch on YouTube
Experience our revolutionary AI OCR technology translating The Witcher's challenging subtitles in real-time! This demo showcases:
- ✨ Gemini AI OCR handling complex backgrounds and stylised fonts that break traditional OCR engines
- 🧠 Context-aware translation maintaining narrative coherence across dialogue
- ⚡ Real-time performance with floating overlays that work seamlessly over any game
- 🎯 Superior accuracy where competitors fail on challenging subtitle scenarios
See why this is the world's first AI-OCR game translator that actually works with difficult gaming content!
- Multiple Gemini Models Available: Flexible model selection for OCR and translation operations with dynamic configuration
  - Gemini 2.5 Flash-Lite: Recommended for speed - optimal for fast-changing subtitles (< 1 second display time)
  - Gemini 2.0 Flash: Recommended for quality - superior OCR accuracy and most idiomatic translations for longer subtitles (1+ seconds)
  - Comprehensive OCR Testing: Curious which models handle challenging subtitles most effectively and consistently? View detailed test results showing Gemini 2.0 models' superior accuracy across multilingual content
  - Dynamic Model Configuration: Models available for selection are controlled by resources/gemini_models.csv for easy customization
  - Performance-Optimized Selection: Choose different models for OCR vs translation based on your specific needs
- Enhanced API Integration: Migration from the google.generativeai library to google.genai for improved performance and stability
- Optimized Threading: Enhanced OCR and translation thread performance for faster processing and reduced latency
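As an illustration of how a CSV-driven model list might be consumed, the sketch below parses rows into a lookup table. The column names (api_name, display_name) and the sample rows are assumptions made for this example; the actual layout of resources/gemini_models.csv may differ.

```python
import csv
import io

# Hypothetical file layout: the real resources/gemini_models.csv may use
# different columns. The rows below are illustrative sample data only.
SAMPLE_CSV = """api_name,display_name
gemini-2.5-flash-lite,Gemini 2.5 Flash-Lite
gemini-2.0-flash,Gemini 2.0 Flash
"""

def load_models(handle):
    """Return {display_name: api_name} for every complete row."""
    models = {}
    for row in csv.DictReader(handle):
        api_name = (row.get("api_name") or "").strip()
        display_name = (row.get("display_name") or "").strip()
        if api_name and display_name:
            models[display_name] = api_name
    return models

# With a real file, pass an open handle instead, e.g.
# open("resources/gemini_models.csv", newline="", encoding="utf-8").
models = load_models(io.StringIO(SAMPLE_CSV))
```

Keeping the list in a data file rather than in code means a newly released model could be offered in the selection lists by adding a row, without touching the application logic.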
- Gemini OCR - Premium Text Recognition: Revolutionary AI-powered OCR that delivers exceptional accuracy for challenging subtitle scenarios where traditional OCR engines struggle
  - Superior OCR quality with outstanding cost-to-quality ratio using advanced Gemini models
  - Challenging Screenshot Examples: See the dramatic difference in quality:
    - Tesseract OCR Result: ~ Trust me, OD tite WE loca mS
    - Gemini OCR Result: Trust me, Oakmonters know a newcomer when they see one. We locals can tell.
    - Tesseract OCR Result: ' Paulie: Driv: show, Tom. Next stop's Bi the motel. 7 jj ie
    - Gemini OCR Result: Paulie: Drive before the cops show, Tom. Next stop's Bill at the motel.
  - Professional Results: Handles low-contrast text, stylized fonts, and dynamic backgrounds that confuse traditional OCR
  - Cost-Effective Excellence: ~$0.00004 per subtitle screenshot - 37.5 times cheaper than Google Cloud Vision API while delivering superior results
  - Unique Gaming Translation Solution: First-of-its-kind integration of premium AI OCR with real-time game subtitle translation
  - Special Cost Estimation: Dedicated API Usage tab with rough cost estimates and comprehensive usage monitoring
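The cost figures quoted for Gemini OCR can be checked with a little arithmetic. The sketch below uses only the two numbers from the text (about $0.00004 per screenshot and the 37.5x ratio); the screenshot count is an arbitrary example, not a measurement.

```python
GEMINI_COST_PER_SCREENSHOT = 0.00004  # USD, approximate figure quoted in the text
PRICE_RATIO = 37.5                    # "37.5 times cheaper", from the text

# Per-image price implied for the comparison service by the two figures above.
implied_comparison_cost = GEMINI_COST_PER_SCREENSHOT * PRICE_RATIO

def estimate_ocr_cost(screenshots, cost_per_screenshot=GEMINI_COST_PER_SCREENSHOT):
    """Rough OCR cost in USD for a given number of subtitle screenshots."""
    return screenshots * cost_per_screenshot

# Arbitrary example volume, chosen only to illustrate the scale.
example_gemini_cost = estimate_ocr_cost(10_000)
example_comparison_cost = estimate_ocr_cost(10_000, implied_comparison_cost)
```

Under these figures the implied comparison price is about $0.0015 per image, so 10,000 screenshots would cost roughly $0.40 with Gemini OCR against roughly $15 at the comparison price. Actual charges depend on image size and current API pricing.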
- Extended Context Window: Expanded sliding history window from 2 to 5 previous subtitles for enhanced translation quality
- Better Context Awareness: Improved narrative coherence and grammatical consistency across longer conversations
- Enhanced Asian Language Support: Extended context particularly beneficial for languages that rely heavily on contextual understanding
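A sliding history window of this kind can be sketched with a bounded deque. This is an illustrative sketch rather than the application's actual code: the class name ContextWindow, its methods, and the prompt layout are all hypothetical.

```python
from collections import deque

class ContextWindow:
    """Keeps the most recent subtitles to send along with a new one."""

    def __init__(self, size=5):
        # A deque with maxlen drops the oldest entry automatically;
        # maxlen=0 stays empty, which disables context entirely.
        self._history = deque(maxlen=size)

    def add(self, subtitle):
        # Skip consecutive duplicates so a repeated capture of an
        # unchanged subtitle does not fill the window.
        if subtitle and (not self._history or self._history[-1] != subtitle):
            self._history.append(subtitle)

    def build_prompt(self, new_subtitle):
        # Hypothetical prompt layout: earlier lines first, new line last.
        lines = [f"Previous: {s}" for s in self._history]
        lines.append(f"Translate: {new_subtitle}")
        return "\n".join(lines)

window = ContextWindow(size=5)
for i in range(1, 8):
    window.add(f"line {i}")
prompt = window.build_prompt("line 8")
```

After seven subtitles, only the last five remain in the window, so the prompt carries lines 3 to 7 as context for line 8. A larger window gives the model more narrative context at the price of more input tokens per request.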
- Gemini 2.5 Flash-Lite Integration: Revolutionary AI-powered translation with advanced context awareness and cost-effectiveness
- Context-aware translation with configurable sliding window (0-2 previous subtitles) for narrative coherence
- Intelligent OCR error correction that automatically fixes garbled input text
- Exceptional cost-effectiveness: translate massive games like The Witcher 3 for pennies
- Built-in real-time cost tracking with token usage analytics and cumulative cost monitoring
- Detailed API call logging with complete transparency (Gemini_API_call_logs.txt)
- Advanced file caching system (gemini_cache.txt) for reduced API costs
- Superior translation quality with context understanding for dialogue flow and character consistency
- DeepL Free Usage Tracker: Monitor your monthly free quota consumption with real-time tracking in the Settings tab
  - Displays current usage against the 500,000 character monthly limit for DeepL API Free accounts
  - Helps users optimize their free tier usage
- Screen Area Selection: Define custom regions for text capture and translation display
- Real-time Translation: Automatically detects and translates text as it changes
- Multiple Translation Engines:
  - Gemini 2.5 Flash-Lite API
  - MarianMT (offline neural machine translation)
  - DeepL API
  - Google Translate API
- Multilingual User Interface: Full support for English and Polish interface languages
- Floating Overlays: Translucent, movable windows that stay on top of other applications
- Customizable Appearance: Adjust colours, fonts, and transparency
- Image Preprocessing: Various modes to improve OCR accuracy
- Hotkey Support: Control the application without switching windows
- Translation Caching: Reduce API calls and improve performance
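Translation caching can be pictured as a lookup keyed by the language pair and source text, so that a repeated subtitle never triggers a second API call. The sketch below is a minimal in-memory version under that assumption; TranslationCache and fake_translate are hypothetical names, and the application's own cache (including its on-disk format) may work differently.

```python
class TranslationCache:
    """In-memory cache keyed by (source language, target language, text)."""

    def __init__(self, translate_fn):
        self._translate_fn = translate_fn  # callable(text, source, target) -> str
        self._store = {}
        self.api_calls = 0  # counts how often the engine was actually called

    def translate(self, text, source, target):
        # Normalise surrounding whitespace so trivially different
        # captures of the same subtitle share one cache entry.
        key = (source, target, text.strip())
        if key not in self._store:
            self.api_calls += 1
            self._store[key] = self._translate_fn(key[2], source, target)
        return self._store[key]

def fake_translate(text, source, target):
    # Stand-in for a real translation engine call.
    return f"[{source}->{target}] {text}"

cache = TranslationCache(fake_translate)
first = cache.translate("Bonjour", "fr", "en")
second = cache.translate("Bonjour ", "fr", "en")  # trailing space, same key
```

Here the second request is served from the cache, so the stand-in engine is called only once. Games tend to repeat menu text and short lines, which is where a cache like this saves the most calls.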
🎮 Perfect for Gamers and Non-Technical Users!
If you want to start using Game-Changing Translator immediately without installing Python or dealing with dependencies, we've prepared a ready-to-use compiled version for you:
- Download both files.
- Run the .exe installer file to unpack and install the application to your preferred folder.
- Install Tesseract OCR (one-time requirement).
- Launch the application by running GameChangingTranslator.exe from your installation folder.
- Experience premium AI OCR and enhanced context translation! 🤖
📖 Installation Guides:
The compiled versions include everything you need - no Python installation required!
- Windows operating system
- Tesseract OCR installed
- Python 3.7 or newer
- Clone this repository:
    git clone https://github.com/tomkam1702/OCR-Translator.git
- Install required Python packages:
    pip install -r requirements.txt
- Run the application:
    python main.py
For detailed installation instructions, see the Installation Guide.
- Launch the application
- Click "Select Source Area (OCR)" and drag to select the text area you want to translate
- Click "Select Target Area (Translation)" and drag to select where you want the translation to appear
- Configure your preferred translation method in the Settings tab
- Click "Start" to begin translation
- Press the ~ key to toggle translation on/off
For more detailed usage instructions, see the User Manual.
- User Manual
- Installation Guide
- Game-Changing Translator Gallery
- Troubleshooting
- Developer Guide
This project is considered feature-complete. Small changes may or may not be made in the future, but generally no active development is planned. If you wish to add features or make changes, the best approach is to fork the repository and develop it further yourself.
This project is free software, licensed under the GNU General Public License version 3 (GPLv3).
You can:
- Use the software for any purpose
- Change the software to suit your needs
- Share the software and your changes with others
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY. See the LICENCE file for complete details.
- Tesseract OCR
- MarianMT
- Google Cloud Translation API
- DeepL API
- Developed with the support of the following AI models: Claude 3.7 Sonnet, Claude Sonnet 4 and Gemini 2.5 Pro
Please note that this project is considered feature-complete. If you wish to make substantial changes, please consider forking the repository instead.
⚠️ FORKING NOTICE: This project requires attribution to the original author. Please read ATTRIBUTION.md before forking or using this code.