hauke96/wiki2book

wiki2book

wiki2book is a tool to create good-looking eBooks from one or more Wikipedia articles.

The goal is to create eBooks (EPUB files) as beautiful as real books from a given list of Wikipedia articles. To achieve this, wiki2book treats Wikipedia-specific and website-specific content of the articles in dedicated ways and therefore produces different results than general-purpose converters (more on this below). This should make reading Wikipedia articles even more fun and may create a whole new readership for this awesome and unimaginably large database of knowledge.

[Image: eBook of the German article about astronomy on a Tolino eBook reader]

Why not simply use pandoc?

Good question.

Pandoc and other converters, like wb2pdf or percollate, are great and yes, they can convert MediaWiki markup to EPUB. In fact, wiki2book by default uses pandoc to turn HTML into EPUB because pandoc does this quite well.

However, when converting MediaWiki markup to EPUB with these tools, a lot of things are missing: correct rendering of math code, downloading and embedding images, evaluating templates, proper handling of tables, ...

Furthermore, these tools are generic and don't make any eBook-specific assumptions, e.g. ignoring CSS styles that are unsuitable for eBooks or excluding Wikipedia-oriented templates.

Another feature missing in all of these tools: You cannot turn multiple articles into a single ready-to-read eBook. This also includes adding a title page, table of contents, custom styles, etc.

wiki2book addresses all these issues and adds nice features to generate beautiful-looking eBooks.

Installation

Usage

Currently only a CLI (command line interface) version of wiki2book exists, so there is no GUI. wiki2book is configured through configuration files, project files and CLI arguments. See the documentation for further information, including a list of all options, or use the --help flag for an overview.

Preliminaries

You need the following tools and fonts when using the default configuration:

  • ImageMagick (to have the magick command).
  • Pandoc (when using the pandoc output driver). See notes on pandoc versions 2 and 3 below.
  • rsvg (to have the rsvg-convert command).
  • Only applies to Linux systems: DejaVu fonts in /usr/share/fonts/TTF/DejaVuSans*.ttf, which are referenced in the default style. If these files should be embedded into the eBook, use the font-files config entry, which is empty by default.

The usage of external tools can be configured, e.g. to use explicit paths to executables, to use completely different tools or to use a custom script. See doc/configuration for further details.
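Before a first run, it can help to check that the tools listed above are actually on the PATH. The following is a minimal sketch, not part of wiki2book: the helper name `missing_tools` is made up for this example, and it only covers the default command names (`magick`, `pandoc`, `rsvg-convert`), so it does not apply if you configured other tools or explicit paths.

```shell
# Print the name of every given command that is not found on PATH.
# (Hypothetical helper for illustration, not shipped with wiki2book.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Command names taken from the list above (default configuration only).
missing=$(missing_tools magick pandoc rsvg-convert)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools found."
fi
```

The check does not look at the DejaVu fonts; on Linux you can list `/usr/share/fonts/TTF/DejaVuSans*.ttf` to see whether they are present.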

CLI

The CLI contains three sub-commands that generate an EPUB file from different sources:

  1. Project: wiki2book project ./path/to/project.json
  2. Article: wiki2book article "article name"
  3. Standalone: wiki2book standalone ./path/to/file.mediawiki

Use wiki2book -h for more information and wiki2book <command> -h for information on a specific command.
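As a small illustration of the article sub-command, the sketch below calls it once per article name. The function `build_articles`, the `WIKI2BOOK` variable and the article names are assumptions made for this example; only the `wiki2book article "article name"` form comes from the list above.

```shell
# Executable to call. Override WIKI2BOOK to use an explicit path
# (assumption: wiki2book is otherwise available on PATH).
WIKI2BOOK="${WIKI2BOOK:-wiki2book}"

# Run the "article" sub-command once for each article name passed in.
# Stops at the first failing call.
build_articles() {
    for name in "$@"; do
        "$WIKI2BOOK" article "$name" || return 1
    done
}

# Example call with placeholder article names:
# build_articles "Astronomy" "Solar System"
```

Setting `WIKI2BOOK=echo` turns this into a dry run that only prints the arguments each call would receive.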

Configuration

See the config documentation.

Pandoc version 2 and 3

Only relevant when using the pandoc output driver.

Pandoc version 2 might internally use CSS3 properties by default, such as the gap property. This can cause problems on certain eBook readers (e.g. Tolino ones). To work around this, pass the argument --pandoc-data-dir ./pandoc/data to wiki2book, which makes it use an HTML template from this repo that does not use the problematic gap property.

Alternatively, install pandoc 3, which avoids these CSS3 properties.
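To decide whether the workaround from this section is needed, you can look at the major version of the installed pandoc. The sketch below assumes that the first line of `pandoc --version` looks like `pandoc 2.19.2`; the helper names `pandoc_major` and `pandoc_workaround_args` are made up for this example. Where exactly the extra argument goes on the wiki2book command line is not covered here, see `wiki2book -h` for that.

```shell
# Extract the major version from the output of `pandoc --version`.
# Assumes the first line looks like "pandoc 2.19.2".
pandoc_major() {
    printf '%s\n' "$1" | sed -n '1s/^pandoc[^0-9]*\([0-9][0-9]*\).*/\1/p'
}

# Print the extra wiki2book argument for pandoc 2, nothing otherwise.
# (Hypothetical helper; the argument itself is the one named above.)
pandoc_workaround_args() {
    if [ "$1" = "2" ]; then
        printf '%s\n' "--pandoc-data-dir ./pandoc/data"
    fi
}

# Only query pandoc if it is installed.
if command -v pandoc >/dev/null 2>&1; then
    major=$(pandoc_major "$(pandoc --version)")
    echo "pandoc major version: $major"
    echo "extra arguments: $(pandoc_workaround_args "$major")"
fi
```

The path `./pandoc/data` is relative, so this only works when wiki2book is started from the root of this repo.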

Contribute

Issues, bugs, ideas

Feel free to open a new issue and fill out the issue template.

Please keep in mind:

  1. This is a hobby-project and my time is limited.
  2. Things that are of little or no use to me personally will be given low/no priority.

Development and code contributions

For building, running, testing, etc. take a look at src/README.md.

There's no established process for code contributions. Please open an issue, describe your idea and how you plan to implement it, and we'll discuss further steps.

Long-term goals

  • Create a public API and web app (#7)
  • Ask Wikipedia if they want to embed/link to this tool in any way (that would be super cool :D)
