Skip to content

Tool for reformatting poorly-cropped webtoon-style pages (continuous vertical comic strips) into more intelligently stacked images, with the ability to export the new images to cbz, cbr, and pdf files.

License

Notifications You must be signed in to change notification settings

eskutcheon/ManhwaFormatter

Repository files navigation

ManhwaFormatter: Webtoon Image Formatter and Archive Generator

ManhwaFormatter is a Python tool for processing vertical webtoon-style comics (like manhwa, manhua, and Korean webcomics) into images which fixes poor crops and cut-off panel content in high-information regions. It also allows for the creation of structured document formats such as .cbz, .cbr, and .pdf using these reconstituted images. It intelligently resizes, segments, and recomposes webtoon panels for optimized viewing across various platforms - from comic book readers to e-readers and print-ready PDFs.

This started as a simple script for my own use with scraped manhwa that were really poorly formatted. I've extended the scripts since then to more intelligently create paginated images to form pdf files and other paginated document types.

Features

  • Auto-resize images to a uniform width using the most common value across a folder
    • Skips resizing landscape-oriented title pages or special pages
  • Intelligent vertical segmentation by detecting blank horizontal regions with low variance
    • Prevents cutting through panels or dialogue bubbles
  • Image directory to document format conversion
    • New paginated mode for .pdf output optimized for page sizes like Letter or A4
    • Supports vertical stacking into long-scroll .cbz or .cbr comics
  • Adaptive spacing logic:
    • Reduces or expands blank space between panels
    • Matches padding color with detected panel gaps
  • Quick and Efficient Processing
    • Handles duplicate images of different file extensions by default
    • Efficient streamed batch processing
    • Fast dimension detection using image metadata parsing

Use Cases

Use Case Description
Mobile readers Create .cbz or .cbr files optimized for vertical scrolling
E-reader export Convert webcomics to paginated .pdf with letter/A4 page sizes
Print layout Generate standardized PDFs for print-ready formatting
Localization / fansubbing Re-segment and recompose webcomics before translation overlays

Example Results

The following examples show the problem of poor image cuts on one of the most popular manhwa of all time, "Solo Leveling" hosted on one of the most popular manga/manhwa/manhua readers of all time, MangaDex. Even with the intense popularity of both, we see that proper formatting of the images may be an afterthought for many, as they instead rely on features of the comic reader to support continuous vertical read modes. The images below are ordered left-to-right in their original read order.

OLD IMAGES

While the images above illustrate the problem, note that we can expect much worse readability when the cuts are made in the middle of a speech bubble or region of very dense comic panels. Below, we see the fixed version that ensures cuts are made in whitespace.

NEW IMAGES

This particular trial was done with a minimum height of 3200, though this can be set arbitrarily with the arguments shown in the Usage section.

Installation

If you're new to programming, Python is generally very easy to setup and walks through the process in their official documentation.

If you want to scrape a whole manga/manhwa/manhua or individual chapters to re-stitch, I recommend installing gallery-dl and using it to scrape from sites like MangaFox. I may consider a webapp for this in the future, but for now it's mainly for transforming files locally.

git clone https://github.com/eskutcheon/ManhwaFormatter.git
cd ManhwaFormatter
pip install -r requirements.txt

Usage

python main.py path/to/inputs path/to/outputs [options]

Arguments

The most pertinent options for a user interested in controlling aspects of image stitching and document creation are given below:

Option         Description                                                                          
--archive     One of ("cbz", "cbr", "pdf")                                                        
--cleanup     Delete all intermediate images after archiving (does nothing if --archive is None)
--min_height Minimum height before a new (non-paginated) stacked page is created - Default: 1600  
--cleanup Flag to automatically delete the new output_dir and its contents after archive creation
--compression The compression level (0-9) for CBR/CBZ/PDF (default: 0 - no compression)
--target_width Target width in pixels for resizing images (optional)
--dpi DPI used for PDF output - Default: 300

Additional options for advanced users can be seen by running

python main.py --help   # this will be revised as a more typical module structure eventually

Example

python main.py ./chapter_001 ./output --min_height 3200
python main.py ./chapter_001 ./output --archive cbz --cleanup
python main.py ./chapter_002 ./output --archive pdf --page_size a4

Roadmap

  • EPUB support
  • GUI wrapper

About

Tool for reformatting poorly-cropped webtoon-style pages (continuous vertical comic strips) into more intelligently stacked images, with the ability to export the new images to cbz, cbr, and pdf files.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages