pdf2mind is an intelligent tool powered by large language models (LLMs) that automatically converts lengthy PDF documents into well-structured mind maps with a single click. It supports output formats including XMind, FreeMind, and SVG.
This project supports both Windows and Linux environments. macOS has not been tested yet.
To set up the environment, run the following commands:
conda create --name pdf2mind python=3.12
conda activate pdf2mind
pip install -r requirements.txt -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
Graphviz is also required.
On Linux (Debian/Ubuntu):
$ apt install graphviz
On macOS:
$ brew install graphviz
On Windows:
Please refer to https://graphviz.org/download/
$ python pdf2mind.py -h
usage: pdf2mind.py [-h] --pdf PDF --model MODEL --language LANGUAGE (--use-doubao | --use-qwen | --use-openai) [--chunk-size CHUNK_SIZE]
[--overlap-size OVERLAP_SIZE] [--max-level MAX_LEVEL] [--temperature TEMPERATURE]
[--only-freemind | --only-xmind | --only-svg]
Command-line parser: PDF filename + Model selection
options:
-h, --help show this help message and exit
--pdf PDF PDF filename
--model MODEL model name
--language LANGUAGE Target language (e.g., 'English', 'Chinese', 'France', etc.)
--use-doubao Use Doubao model
--use-qwen Use Qwen model
--use-openai Use OpenAI model
--chunk-size CHUNK_SIZE
chunk size of PDF (optional, default 30000)
--overlap-size OVERLAP_SIZE
overlap size of PDF (optional, default 1000)
--max-level MAX_LEVEL
maximum level for mind maps (optional, default: 4)
--temperature TEMPERATURE
LLM temperature (optional, default: 0.7)
--only-freemind Only generate FreeMind (.mm) format
--only-xmind Only generate XMind (.md) format
--only-svg Only generate SVG (.svg) format
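The help text above implies two mutually exclusive groups: exactly one vendor flag is required, and at most one `--only-*` flag may be given. As a rough illustration, a parser with the same surface could be declared with `argparse` as follows. This is a sketch reconstructed from the help output only; the function name `build_parser` is hypothetical, and the project's real parser may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the documented CLI surface (illustrative, not the project's code)."""
    parser = argparse.ArgumentParser(
        prog="pdf2mind.py",
        description="Command-line parser: PDF filename + Model selection",
    )
    parser.add_argument("--pdf", required=True, help="PDF filename")
    parser.add_argument("--model", required=True, help="model name")
    parser.add_argument("--language", required=True, help="Target language")

    # Exactly one vendor flag must be supplied.
    vendor = parser.add_mutually_exclusive_group(required=True)
    vendor.add_argument("--use-doubao", action="store_true", help="Use Doubao model")
    vendor.add_argument("--use-qwen", action="store_true", help="Use Qwen model")
    vendor.add_argument("--use-openai", action="store_true", help="Use OpenAI model")

    # Optional tuning knobs, with the defaults listed in the help text.
    parser.add_argument("--chunk-size", type=int, default=30000)
    parser.add_argument("--overlap-size", type=int, default=1000)
    parser.add_argument("--max-level", type=int, default=4)
    parser.add_argument("--temperature", type=float, default=0.7)

    # At most one --only-* flag; with none, every format is produced.
    only = parser.add_mutually_exclusive_group()
    only.add_argument("--only-freemind", action="store_true")
    only.add_argument("--only-xmind", action="store_true")
    only.add_argument("--only-svg", action="store_true")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list instead of sys.argv for demonstration.
    args = build_parser().parse_args(
        ["--pdf", "paper.pdf", "--model", "some-model",
         "--language", "English", "--use-qwen", "--only-svg"]
    )
    print(args.chunk_size, args.use_qwen, args.only_svg)
```

With this layout, passing two vendor flags or two `--only-*` flags makes `argparse` exit with a usage error, which matches the parenthesized and bracketed groups shown in the usage line.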
Vendor | Required ENV Variable |
---|---|
OpenAI | OPENAI_API_KEY |
Qwen | DASHSCOPE_API_KEY |
Doubao | ARK_API_KEY |
Software | Format |
---|---|
XMind | .md |
FreeMind | .mm |
SVG | .svg |
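For readers wondering what `--chunk-size` and `--overlap-size` control, the following sketch shows one common way to split long text into overlapping windows. It assumes the sizes are measured in characters, which the help text does not state, and `split_with_overlap` is a hypothetical function; pdf2mind's own splitting logic may differ.

```python
def split_with_overlap(
    text: str, chunk_size: int = 30000, overlap_size: int = 1000
) -> list[str]:
    """Split `text` into windows of `chunk_size` characters.

    Consecutive windows share `overlap_size` characters, so content near a
    boundary appears in both neighbors. Units are an assumption (characters).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_size < chunk_size:
        raise ValueError("overlap_size must satisfy 0 <= overlap_size < chunk_size")

    step = chunk_size - overlap_size  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks
```

A larger overlap reduces the chance that a sentence is cut off from its context at a chunk boundary, at the cost of sending more repeated text to the model.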
Here is an example using the Doubao large language model:
$ setx ARK_API_KEY ***key*** # On Windows
$ export ARK_API_KEY=***key*** # On Linux/macOS
$ python pdf2mind.py --pdf testdata/GreenAI-2page.pdf --language Chinese --use-doubao --model doubao-1-5-lite-32k-250115
After successful execution, mind maps in all three supported formats will be generated in the source directory.
$ docker build -t hyongtao-db/pdf2mind:0.0.1 .
$ docker run \
hyongtao-db/pdf2mind:0.0.1 \
-h
usage: pdf2mind.py [-h] --pdf PDF --model MODEL --language LANGUAGE
(--use-doubao | --use-qwen | --use-openai)
[--chunk-size CHUNK_SIZE] [--overlap-size OVERLAP_SIZE]
[--max-level MAX_LEVEL] [--temperature TEMPERATURE]
[--only-freemind | --only-xmind | --only-svg]
Command-line parser: PDF filename + Model selection
$ docker run \
-e ARK_API_KEY=$ARK_API_KEY \
-v $(pwd)/:/data/ \
hyongtao-db/pdf2mind:0.0.1 \
--pdf /data/testdata/GreenAI-2Page.pdf \
--language Chinese \
--use-doubao \
--model doubao-1-5-lite-32k-250115
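Make sure the path passed to --pdf matches the volume mount: the command above mounts the current directory at /data inside the container, so the PDF must be referenced by its /data/... path rather than by its path on the host.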
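When scripting the Docker invocation, the host-to-container translation can be computed rather than typed by hand. The sketch below is one way to do that, assuming the host directory is mounted at `/data` as in the command above; `container_path` is a hypothetical helper, not something pdf2mind provides.

```python
from pathlib import Path, PurePosixPath


def container_path(host_file: str, host_root: str, mount_point: str = "/data") -> str:
    """Translate a host file path into the path seen inside the container.

    `host_root` is the host directory given on the left side of `-v`, and
    `mount_point` is the container directory on the right side.
    Raises ValueError if `host_file` is not located under `host_root`.
    """
    relative = Path(host_file).resolve().relative_to(Path(host_root).resolve())
    # Container paths are always POSIX-style, even when the host is Windows.
    return str(PurePosixPath(mount_point).joinpath(*relative.parts))
```

For example, with `host_root` set to the project directory, a file at `testdata/example.pdf` under that directory maps to `/data/testdata/example.pdf`.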
- Highest Priority
  - ✅ Implement asynchronous I/O
  - ✅ Design a more comprehensive class structure
  - ✅ Add logging functionality
  - ✅ Add .gitignore
  - ✅ Optimize configuration parameters, including: model temperature, PDF chunk/overlap length, maximum depth of mind map, etc.
- Lower Priority
  - Switch to poetry for dependency management
  - ✅ Add Docker support
  - Provide a Flask-based frontend API service
  - Add unit tests using pytest
  - Add GitHub workflow automation
  - Support more models and output formats
There’s so much more I’d love to implement. Once these tasks are completed, I plan to revisit foundational knowledge on large language models and Python project best practices.
- I’ve learned a great deal from yihong0618’s xiaogpt and bilingual_book_maker.
- The project ChatPaper2Xmind has also been a great inspiration.