llnhnv/manga_read_along


Manga Read-Along Video Creator

Overview

This repository contains a set of Python scripts that turn raw manga images and their associated JSON data into read-along videos. Inspired by children's voice books, the videos show each character's speech bubble in sync with the narration, giving weebs an interactive reading experience. (Yes, I used ChatGPT to generate this.)

Features

  • Image Processing: Uses Magiv2 to extract the transcript from the manga pages.
  • Bubble Chat Animation: Creates a dynamic speech bubble sequence corresponding to character dialogues.
  • Video Creation: Converts the processed images into a video format using img2mp4.py, enabling smooth playback of the read-along experience.
  • Voice over: Uses a TTS model to read out the dialogue extracted from the images. The TTS engine converts the text into natural-sounding speech, which is then synchronized with the bubble animations.
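
To make the synchronization between narration and bubble animation concrete, here is a minimal sketch of one way to derive bubble timings from the lengths of the narration clips. It is not taken from this repository: the names `BubbleCue`, `schedule_bubbles`, and `total_length`, and the `gap` pause between clips, are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BubbleCue:
    """Timing for one speech bubble (hypothetical structure, not from the repo)."""
    index: int    # position of the bubble in reading order
    start: float  # seconds at which the bubble appears
    end: float    # seconds at which its narration clip finishes


def schedule_bubbles(durations: List[float], gap: float = 0.25) -> List[BubbleCue]:
    """Show each bubble at the moment its narration clip starts.

    `durations` holds the length in seconds of each TTS clip, in reading order.
    `gap` is an assumed pause inserted between consecutive clips.
    """
    cues: List[BubbleCue] = []
    t = 0.0
    for i, d in enumerate(durations):
        if d < 0:
            raise ValueError("clip durations must be non-negative")
        cues.append(BubbleCue(index=i, start=t, end=t + d))
        t += d + gap  # the next bubble waits for this clip plus the pause
    return cues


def total_length(cues: List[BubbleCue]) -> float:
    """Length of the narration track, ignoring any trailing pause."""
    return cues[-1].end if cues else 0.0
```

With clips of 1.0, 2.0, and 0.5 seconds and a 0.5 second gap, the bubbles would appear at 0.0, 1.5, and 4.0 seconds. The same cue list could then drive both the frame in which a bubble is drawn and the offset at which its audio clip is placed.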

Files Included

  • main.py: The entry point that runs the whole process. It is still very buggy, but you can read it to follow the pipeline.
  • demo-manga-read-along.ipynb: Walks through the process step by step.
  • requirements.txt: Lists the dependencies needed to run the scripts.

Usage

  1. Setup Environment:

    • Make sure you have Python installed.
    • Create a virtual environment (optional but recommended) and install the required packages:
    pip install -r requirements.txt
  2. Prepare Your Data:

    • \src\config.py: Contains all the paths you need to know. This project requires you to provide manga images, character images, and voice bank samples for voice cloning.
    • Works best when the raw pages are in English and are named in sequential order (e.g., 01.jpg, 02.jpg, 03.jpg). Character images should follow the naming format luffy_1.jpg, nami_1.jpg.
  3. Process Images and Create the Video with Voice:

    • Check out the Kaggle file.
  4. View the Demo:

    • A video demo showcasing the read-along feature can be found in the repository.
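
The naming conventions from step 2 can be checked before running the pipeline. The sketch below is an illustration only, not code from this repository: `sort_pages` and `group_characters` are hypothetical helpers, and the regular expression assumes character files are named as letters, an underscore, and a number (as in luffy_1.jpg).

```python
import re
from pathlib import Path
from typing import Dict, Iterable, List

# Assumed pattern for character images: <name>_<number>, e.g. "luffy_1".
CHARACTER_RE = re.compile(r"^(?P<name>[A-Za-z]+)_(?P<num>\d+)$")


def sort_pages(filenames: Iterable[str]) -> List[str]:
    """Return page files (purely numeric stems such as 01.jpg) in reading order."""
    pages = [Path(f) for f in filenames if Path(f).stem.isdigit()]
    # Sort numerically so that 10.jpg comes after 2.jpg even without zero padding.
    return [p.name for p in sorted(pages, key=lambda p: int(p.stem))]


def group_characters(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Group character images by character name, ordered by their number."""
    numbered: Dict[str, List[tuple]] = {}
    for f in filenames:
        match = CHARACTER_RE.match(Path(f).stem)
        if match:  # files that do not follow the pattern are skipped
            key = match.group("name")
            numbered.setdefault(key, []).append((int(match.group("num")), Path(f).name))
    return {name: [fname for _, fname in sorted(items)] for name, items in numbered.items()}


if __name__ == "__main__":
    files = ["10.jpg", "02.jpg", "01.jpg", "luffy_2.jpg", "nami_1.jpg", "luffy_1.jpg"]
    print(sort_pages(files))
    print(group_characters(files))
```

Sorting on the integer value of the stem, rather than on the raw string, keeps pages in order even if some files are not zero-padded.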

Image Demo

page_002_panel_000_bubble_000 page_002_panel_000_bubble_001 page_002_panel_000_bubble_002

Video Demo

No-Colour Full-Page Demo

no_color_fullpage.mp4

Colourized Panel-View Demo

color_panel.mp4

Contribution

Feel free to fork the repository and submit pull requests if you have improvements or suggestions.

License

This project is licensed under the MIT License - see the LICENSE file for details.

