Note: raw code without proper OOP structuring, but it works.
- does not use external libraries
- written in pure python
- parses metadata, manifest and spine of epub
- extracts text from epub
- gives chapters path
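
As a rough illustration of what parsing the metadata, manifest and spine involves with only the standard library, here is a minimal sketch. It is not the code in epub.py: the helper names `find_opf_path` and `read_package` are hypothetical, and it assumes the epub is an ordinary zip archive with `META-INF/container.xml` pointing at the OPF package file, as the EPUB container format specifies.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the EPUB container, the OPF package and Dublin Core.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def find_opf_path(epub):
    """Return the OPF path declared in META-INF/container.xml (hypothetical helper)."""
    root = ET.fromstring(epub.read("META-INF/container.xml"))
    rootfile = root.find("c:rootfiles/c:rootfile", NS)
    return rootfile.get("full-path")


def read_package(epub):
    """Return (metadata, manifest, spine) read from the OPF package (hypothetical helper)."""
    package = ET.fromstring(epub.read(find_opf_path(epub)))
    # Only the first title, creator and identifier are read in this sketch.
    metadata = {
        "title": package.findtext("opf:metadata/dc:title", namespaces=NS),
        "creator": package.findtext("opf:metadata/dc:creator", namespaces=NS),
        "identifier": package.findtext("opf:metadata/dc:identifier", namespaces=NS),
    }
    # Manifest: item id -> href, relative to the OPF file's directory.
    manifest = {
        item.get("id"): item.get("href")
        for item in package.findall("opf:manifest/opf:item", NS)
    }
    # Spine: manifest ids in reading order.
    spine = [ref.get("idref") for ref in package.findall("opf:spine/opf:itemref", NS)]
    return metadata, manifest, spine
```

Both helpers take an open `zipfile.ZipFile`, for example `with zipfile.ZipFile("book.epub") as epub: print(read_package(epub))`, where `book.epub` is a placeholder file name.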
To run the script, git clone the repo, then:
cd epub_parse/epub3
python3 epub.py
- If you run epub.py as a script, you can read an epub or its metadata.
You can use the following functions:
- get_opf_path(): returns the OPF path from the container.xml file
- get_opf_data(): returns the package.opf data
- get_metadata(): returns the metadata (title, author name, identifier)
- get_manifest(): returns the manifest of the epub
- get_spine(): returns the spine content
- get_chapter_path(): returns the paths to all chapters
- get_text(): returns the full text of the epub
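
To show how chapter paths and text could be derived from the manifest and spine, here is a small standard-library sketch. It is an assumption about the general approach, not the repo's implementation: `chapter_paths`, `chapter_text` and `TextExtractor` are hypothetical names. It also assumes chapters are UTF-8 XHTML files and that manifest hrefs are plain, not percent-encoded.

```python
import posixpath
import zipfile
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth of script/style elements currently open

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def chapter_paths(opf_path, manifest, spine):
    """Resolve spine idrefs to paths inside the archive, in reading order.

    manifest maps item id -> href; hrefs are relative to the OPF's directory.
    """
    base = posixpath.dirname(opf_path)
    return [posixpath.normpath(posixpath.join(base, manifest[idref])) for idref in spine]


def chapter_text(epub, path):
    """Return the text of one chapter from an open zipfile.ZipFile."""
    parser = TextExtractor()
    parser.feed(epub.read(path).decode("utf-8"))
    parser.close()
    return "\n".join(parser.parts)
```

Joining `chapter_text` over every path from `chapter_paths` gives the full text of the book. Each text node ends up on its own line, so inline markup such as bold or italics splits a sentence across lines. A fuller version would track block-level tags instead.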
Epub-Usage - The 2 epubs used are listed and were freely available on the web.