MangaTranslator

Translate speech bubbles in manga and comics using AI (YOLO for detection, LLMs for translation). Includes a Gradio Web UI and a CLI.

Features

Automatic speech bubble detection & segmentation.
Text removal & cleaning from detected bubbles.
Text extraction & translation via vision-capable LLMs.
Renders translated text onto images with selected fonts.
Web Interface (Gradio) & Command-Line Interface (CLI).

Requirements

Python >= 3.10
YOLO model with mask segmentation (trained for speech bubbles)
Vision-capable LLM (API or local)

Installation

Windows Portable

Download the standalone zip (NVIDIA GPU or CPU) from the releases page.

The zip includes the recommended YOLO model and the Komika font pack.

Manual Installation

Clone Repository:

git clone https://github.com/meangrinch/MangaTranslator.git
cd MangaTranslator

Create Virtual Environment (Recommended):

# Create venv
python -m venv venv

# Activate (Windows CMD/PowerShell)
.\venv\Scripts\activate

# Activate (Linux/macOS)
source venv/bin/activate

# Activate (Git Bash on Windows)
source venv/Scripts/activate
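
To confirm that the activated interpreter meets the Python >= 3.10 requirement and really is the virtual environment's interpreter, a quick check along these lines can help. This is a minimal sketch using only the standard library; the function name `check_interpreter` is made up for illustration and is not part of MangaTranslator.

```python
import sys


def check_interpreter(version_info=sys.version_info, prefix=sys.prefix,
                      base_prefix=sys.base_prefix):
    """Return (version_ok, in_venv) for the given interpreter details."""
    version_ok = tuple(version_info[:2]) >= (3, 10)
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the base Python installation.
    in_venv = prefix != base_prefix
    return version_ok, in_venv


if __name__ == "__main__":
    version_ok, in_venv = check_interpreter()
    print(f"Python >= 3.10: {version_ok}, inside venv: {in_venv}")
```

Run it with the same `python` command used for the remaining installation steps; both values should be `True` before continuing.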

Install PyTorch:

# Example (CUDA 12.4)
pip install torch==2.6.0+cu124 torchvision==0.21.0+cu124 --extra-index-url https://download.pytorch.org/whl/cu124

# Example (CPU)
pip install torch

Refer to the official PyTorch installation guide for system-specific commands.
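
After installing, it can be worth checking which PyTorch build ended up in the environment and whether it can see a CUDA device. The sketch below relies only on `torch.__version__` and `torch.cuda.is_available()`; the helper name `describe_torch_install` is an illustrative assumption, not something MangaTranslator provides.

```python
import importlib.util


def describe_torch_install():
    """Report whether torch is importable and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "version": None, "cuda_available": False}
    import torch  # imported lazily so the check also works without PyTorch

    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_install())
```

A CPU-only install is expected to report `cuda_available` as `False`; with the CUDA example above, the version string should carry the `+cu124` suffix.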

Install Dependencies:
```
pip install -r requirements.txt
```

Post-Installation Setup

Download YOLO Model:
- Download the recommended model and place it in the models directory.

Prepare Fonts:

Place font folders (containing .otf/.ttf files) inside fonts/.
Font variants need 'italic' or 'bold' in their filename to be used for emphasis.

Example structure:

fonts/
├── CC Wild Words/
│   ├── CC Wild Words Roman.otf
│   ├── CC Wild Words Italic.otf
│   ├── CC Wild Words Bold.otf
│   └── CC Wild Words Bold Italic.otf
└── Another Font/
    ├── AnotherFont-Regular.ttf
    └── AnotherFont-BoldItalic.ttf

Note: "CC Wild Words" is a common manga translation font.
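
The naming rule above can be pictured with a small sketch that classifies font files by looking for "bold" and "italic" in the filename. This is not MangaTranslator's actual loader: the function names, the case-insensitive matching, and the separate `bold_italic` category are assumptions made for illustration.

```python
from pathlib import Path

FONT_EXTENSIONS = {".otf", ".ttf"}


def classify_font_file(filename):
    """Guess a font style from 'bold' / 'italic' in the filename."""
    name = Path(filename).stem.lower()
    is_bold = "bold" in name
    is_italic = "italic" in name
    if is_bold and is_italic:
        return "bold_italic"
    if is_bold:
        return "bold"
    if is_italic:
        return "italic"
    return "regular"


def scan_font_pack(font_dir):
    """Map each style found in one font folder to its file path."""
    variants = {}
    for path in sorted(Path(font_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in FONT_EXTENSIONS:
            # Keep the first file seen for each style.
            variants.setdefault(classify_font_file(path.name), path)
    return variants
```

For example, `scan_font_pack("fonts/CC Wild Words")` would map the four files in the structure above to `regular`, `italic`, `bold`, and `bold_italic`, which is a quick way to spot a variant whose filename lacks the expected keyword.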

Set Up LLM:
- Supports external providers (Gemini, OpenAI, etc.) and local models (Ollama, LM Studio, etc.).
- Web UI: Configure in the "Config" tab (API keys are saved locally to config.json).
- CLI: Pass API keys/endpoints as arguments.

Note: Environment variables (e.g., GEMINI_API_KEY) can also be used. See the "Config" tab for details.
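
Since a key can come from a CLI argument, an environment variable, or config.json, a wrapper script may want to resolve it from one place. The sketch below shows one possible lookup order. The precedence (argument, then environment, then file) and the `gemini_api_key` field name are assumptions for illustration; the real config.json schema and the tool's own precedence are not documented here.

```python
import json
import os
from pathlib import Path


def resolve_api_key(cli_value=None, env_var="GEMINI_API_KEY",
                    config_path="config.json", config_field="gemini_api_key"):
    """Return the first API key found: CLI value, env var, then config file."""
    if cli_value:
        return cli_value
    env_value = os.environ.get(env_var)
    if env_value:
        return env_value
    path = Path(config_path)
    if path.is_file():
        # config_field is a hypothetical key name, not the real config schema.
        with path.open(encoding="utf-8") as handle:
            return json.load(handle).get(config_field)
    return None
```

Keeping the key in an environment variable avoids typing it into shell history, which is the main practical reason to prefer it over a command-line argument.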

Running

Web UI (Gradio):

Run start-webui.bat (Windows), or run python app.py --open-browser

Note: The first launch takes longer to open (~1-2 minutes).

CLI:

# Example (Single - Japanese -> English - Gemini): 
python main.py --input <image_path> --yolo-model <model_path> --provider Gemini --gemini-api-key <key>

# Example (Batch - Custom Language - Ollama): 
python main.py --input <folder_path> --batch --yolo-model <model_path> --font-dir <custom_font_dir> --input-language <custom_language> --output-language <custom_language> --provider OpenAI-Compatible --openai-compatible-url <url> --output <custom_output_folder>

# See all options: 
python main.py --help
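
When the CLI is driven from another script, building the argument list in Python avoids shell quoting problems with paths that contain spaces. The sketch below uses only the flags shown in the batch example above; the function names are illustrative assumptions, and `python main.py --help` remains the authoritative list of options.

```python
import subprocess
import sys


def build_batch_command(input_dir, model_path, url, output_dir,
                        input_language, output_language, font_dir=None):
    """Assemble a batch-mode main.py invocation from flags shown in the CLI examples."""
    command = [
        sys.executable, "main.py",
        "--input", str(input_dir),
        "--batch",
        "--yolo-model", str(model_path),
        "--input-language", input_language,
        "--output-language", output_language,
        "--provider", "OpenAI-Compatible",
        "--openai-compatible-url", url,
        "--output", str(output_dir),
    ]
    if font_dir is not None:
        command += ["--font-dir", str(font_dir)]
    return command


def run_batch(**kwargs):
    """Run the command from the repository root; raises if main.py exits non-zero."""
    return subprocess.run(build_batch_command(**kwargs), check=True)
```

Using `sys.executable` keeps the call inside the currently active virtual environment rather than whichever `python` happens to be first on the PATH.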

Basic Usage (Web UI)

1. Launch the Web UI.
2. Use the "Translator" (single image) or "Batch" (multiple images) tab.
3. Upload the manga/comic page image(s).
4. Select the Font, Source Language, and Target Language.
5. Go to the "Config" tab:
   - Set Translation -> LLM Provider, Model, API Key/Endpoint.
   - Set Detection -> Reading Direction (rtl/ltr).
   - Click "Save Config" (optional).
6. Return to the previous tab and click "Translate" / "Start Batch Translating".
7. Output is saved to ./output/ by default.
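
As one way to picture what the Reading Direction setting influences, the sketch below orders bubble bounding boxes for right-to-left (typical for manga) or left-to-right reading. It is not taken from MangaTranslator's source: the function name, the `(x_min, y_min, x_max, y_max)` box format, and the `row_tolerance` heuristic are all assumptions made for illustration.

```python
def order_bubbles(boxes, direction="rtl", row_tolerance=50):
    """Sort (x_min, y_min, x_max, y_max) boxes into a simple reading order.

    Boxes whose top edges lie within row_tolerance pixels of the first box in
    a row are treated as the same row; rows are read top to bottom.
    """
    if direction not in ("rtl", "ltr"):
        raise ValueError("direction must be 'rtl' or 'ltr'")
    rows = []
    for box in sorted(boxes, key=lambda b: b[1]):  # top to bottom
        if rows and box[1] - rows[-1][0][1] <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    ordered = []
    for row in rows:
        # rtl reads the right-most bubble first; ltr the left-most.
        ordered.extend(sorted(row, key=lambda b: b[0], reverse=(direction == "rtl")))
    return ordered
```

With the wrong direction selected, bubbles in the same row would be handed to the LLM in reversed order, which tends to scramble dialogue that spans several bubbles.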

Note: A "cleaning only" mode is also available in the "Other" sub-tab.

Updating

Navigate to the MangaTranslator directory and run:

git pull

Customization

Place custom YOLO models (.pt/.onnx) in models/ (if using the Web UI). Models must support segmentation and be trained for speech bubbles.
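
To check which files in models/ have a supported extension, a short listing script is enough. This is a sketch under the assumption that only files directly inside the folder matter; the helper name `list_yolo_models` is hypothetical, and the script cannot tell whether a model actually supports segmentation.

```python
from pathlib import Path

MODEL_EXTENSIONS = {".pt", ".onnx"}


def list_yolo_models(models_dir="models"):
    """Return model files (.pt/.onnx) found directly inside models_dir, sorted by name."""
    directory = Path(models_dir)
    if not directory.is_dir():
        return []
    return sorted(
        path for path in directory.iterdir()
        if path.is_file() and path.suffix.lower() in MODEL_EXTENSIONS
    )


if __name__ == "__main__":
    for model in list_yolo_models():
        print(model.name)
```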

License

Apache-2.0. See LICENSE.

Author

grinnch
