High-performance GitBook to PDF converter with parallel processing, resume capability, a merge option, and smart filtering!
git clone https://github.com/tsoodo/gitbook2pdf ./gb2pdf
cd gb2pdf
bun install
# Interactive mode
bun pdf
# Direct conversion
bun pdf --url https://careers.gitbook.com/
# Generate single merged PDF
bun pdf --url https://careers.gitbook.com/ --merge
- Lightning Fast: Parallel processing with configurable concurrency
- Resume Support: Continue interrupted conversions
- Merge Option: Combine all pages into a single PDF
- Smart Filtering: Include/exclude patterns with regex support
- Interactive Controls: Control the conversion while it runs (q/r/o)
- Progress Tracking: Real-time progress with detailed statistics
- Quality Options: Multiple quality presets (low/medium/high)
- Format Support: A4, A3, and Letter formats
- Auto Organization: Categorizes PDFs into folders
- Robust Error Handling: Retry logic with exponential backoff
- Element Hiding: Removes navigation for clean PDFs
- Performance Monitoring: Tracks conversion speed and file sizes
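The retry behaviour listed above can be pictured with a small sketch. This is not taken from the gb2pdf source: the helper name `withRetry`, the 1000 ms base delay, and the reading of `--retries` as "extra attempts after the first one" are all assumptions made for illustration.

```typescript
// Hypothetical helper illustrating retry with exponential backoff.
// Not the gb2pdf implementation; names and defaults are assumptions.
async function withRetry<T>(
  task: (attempt: number) => Promise<T>,
  retries = 3, // assumed to mean: retries after the first attempt
  baseDelayMs = 1000, // assumed base delay
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === retries) break; // no attempts left
      // Double the wait each time: base, 2x base, 4x base, ...
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

The `sleep` parameter is injectable only so the waiting can be replaced in tests; a real converter would likely wrap each page render in a call like `withRetry(() => renderPage(url))`, where `renderPage` is again a hypothetical name.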
- Bun (latest version recommended)
If you don't specify the --merge flag, the converter will ask you:
PDF Output Options:
1. Individual PDFs (organized by category)
2. Single merged PDF (all content in one file)
Would you like to create a single merged PDF? (y/N):
While the conversion is running, you can use these keyboard shortcuts:
- q: Quit gracefully (saves progress)
- r: Restart conversion from the beginning
- o: Open output folder in file manager
- Ctrl+C: Force quit
# Interactive mode with prompts
bun pdf
# Direct URL conversion
bun pdf --url https://careers.gitbook.com/
# Specify output directory
bun pdf -u https://careers.gitbook.com/ -o ./my-pdfs
# High-performance conversion
bun pdf \
--url https://careers.gitbook.com/ \
--concurrency 8 \
--quality high \
--format A3
# Resume previous conversion
bun pdf --url https://careers.gitbook.com/ --resume
# Generate merged PDF
bun pdf --url https://careers.gitbook.com --merge --quality high
# Selective conversion with filters
bun pdf \
--url https://careers.gitbook.com \
--include ".*/api/.*" \
--exclude ".*/internal/.*" \
--exclude ".*/deprecated/.*"
# Custom configuration
bun pdf \
--url https://github.com/tsoodo/gitbook2pdf \
--concurrency 6 \
--delay 500 \
--retries 5 \
--timeout 45000 \
--no-hide-elements
Option | Short | Default | Description |
---|---|---|---|
--url | -u | - | GitBook URL (required) |
--output | -o | ./pdfs | Output directory |
--concurrency | -c | 4 | Concurrent PDF processes |
--retries | -r | 3 | Retry attempts for failed pages |
--delay | -d | 1000 | Delay between requests (ms) |
--hide-elements | - | true | Hide navigation elements |
--format | - | A4 | PDF format (A4/A3/Letter) |
--quality | - | medium | PDF quality (low/medium/high) |
--resume | - | false | Resume previous conversion |
--include | - | [] | Include URL patterns (regex) |
--exclude | - | [] | Exclude URL patterns (regex) |
--timeout | - | 30000 | Request timeout (ms) |
--merge | - | false | Merge all pages into single PDF |
--help | -h | - | Show help message |
Individual PDFs (default):
pdfs/
├── getting-started/
│   ├── 001_installation.pdf
│   └── 002_quick-start.pdf
├── api/
│   ├── 003_authentication.pdf
│   ├── 004_endpoints.pdf
│   └── 005_examples.pdf
├── guides/
│   └── 006_advanced-usage.pdf
└── .progress.json    # Resume data
Merged PDF (with --merge):
pdfs/
├── merged-gitbook.pdf    # Single merged PDF
└── .progress.json        # Resume data
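To make the layout above more concrete, here is one way a page URL could map to a path like `pdfs/api/003_authentication.pdf`. The actual naming rules are not documented here, so treat this as a guess: the function `outputPath`, the use of the first path segment as the category, and the three-digit zero-padded counter are all assumptions inferred from the example tree.

```typescript
// Hypothetical mapping from page URL to output file path.
// The real tool may derive categories and names differently.
function outputPath(pageUrl: string, index: number, outDir = "./pdfs"): string {
  // Split the URL path into non-empty segments, e.g. ["api", "authentication"].
  const segments = new URL(pageUrl).pathname.split("/").filter(Boolean);
  // Last segment becomes the file slug; the site root falls back to "index".
  const slug = segments.length > 0 ? segments[segments.length - 1] : "index";
  // Assumption: the first segment is the category when the page is nested.
  const category = segments.length > 1 ? segments[0] : "";
  // Zero-pad the running page number to three digits: 3 -> "003".
  const number = String(index).padStart(3, "0");
  return [outDir, category, `${number}_${slug}.pdf`].filter(Boolean).join("/");
}
```

Under these assumptions, `outputPath("https://docs.example.com/api/authentication", 3)` yields `./pdfs/api/003_authentication.pdf`, which matches the shape of the tree shown above.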
# Convert entire documentation site
bun pdf --url https://careers.gitbook.com --concurrency 8 --quality high
# Convert only API docs
bun pdf --url https://docs.snyk.io/snyk-api --include ".*/snyk-api/.*" --format A3
# Create single PDF for easy sharing
bun pdf --url https://docs.zenml.io/ --merge --quality high
# Quick conversion for offline reading
bun pdf --url https://developer.thunderbird.net/ --quality medium
# Automated PDF generation in CI
bun pdf --url "$DOCS_URL" --output ./dist/pdfs --no-hide-elements
- Speed: 3-8x faster than sequential processing
- Memory: Optimized for large documentation sites
- Concurrency: Handles 50+ pages efficiently
- Resume: Zero data loss on interruption
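The speedup over sequential processing comes from running several page conversions at once while capping how many are in flight. A minimal sketch of such a cap follows; `mapWithConcurrency` is a hypothetical name and the worker-loop design is an assumption, not a description of how gb2pdf schedules its work.

```typescript
// Hypothetical concurrency-limited map: at most `limit` workers run at once.
// Results keep the order of the input items.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner repeatedly claims the next item until none remain.
  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}
```

With `limit` set to the `--concurrency` value, raising the limit trades memory for throughput, which is why lowering it is the usual remedy when a machine runs short of memory. Note that in this sketch a rejected worker rejects the whole call, so per-page error handling would need to live inside `worker`.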
- Increase concurrency for powerful machines: --concurrency 10
- Use low quality for drafts: --quality low
- Filter unnecessary pages with --exclude
- Enable resume for large sites: --resume
- Use merge for a single document: --merge
--include ".*/guides/.*" --include ".*/api/.*"
--exclude ".*/admin/.*" --exclude ".*/internal/.*"
--include ".*/v2/.*" --exclude ".*/v2/deprecated/.*"
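How the include and exclude patterns combine is not spelled out above, so the sketch below picks one plausible rule: a URL is converted when it matches at least one include pattern (or no include patterns were given) and matches no exclude pattern. The function name `shouldConvert` and this precedence are assumptions.

```typescript
// Hypothetical filter: exclude patterns win over include patterns.
// Patterns are regex source strings, as passed to --include / --exclude.
function shouldConvert(
  url: string,
  include: string[] = [],
  exclude: string[] = [],
): boolean {
  const matchesAny = (patterns: string[]): boolean =>
    patterns.some((pattern) => new RegExp(pattern).test(url));
  // An empty include list is treated as "include everything".
  const included = include.length === 0 || matchesAny(include);
  return included && !matchesAny(exclude);
}
```

Under this rule, the combination `--include ".*/v2/.*" --exclude ".*/v2/deprecated/.*"` keeps `/v2/intro` and drops `/v2/deprecated/old` as well as anything outside `/v2/`.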
- puppeteer - Headless Chrome for PDF generation
- xml2js - XML sitemap parsing
- Bun APIs - File I/O, HTTP, and process management
Timeout Errors
# Increase timeout for slow pages
bun pdf --url https://docs.example.com --timeout 60000
Memory Issues
# Reduce concurrency for limited memory
bun pdf --url https://docs.example.com --concurrency 2
Failed Pages
# Increase retries for unstable connections
bun pdf --url https://docs.example.com --retries 5
Resume Conversion
# Continue where you left off
bun pdf --url https://docs.example.com --resume
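Resuming amounts to skipping pages already recorded as done. The format of `.progress.json` is not documented here, so the sketch below invents a minimal shape (`{ "completed": [...] }`) purely for illustration; `Progress`, `parseProgress`, and `pendingUrls` are hypothetical names.

```typescript
// Hypothetical shape of the resume data; the real .progress.json may differ.
interface Progress {
  completed: string[]; // URLs whose PDFs were already written
}

// Parse the file contents, returning null if missing or malformed.
function parseProgress(raw: string | null): Progress | null {
  if (raw === null) return null;
  try {
    const data = JSON.parse(raw);
    if (Array.isArray(data?.completed)) {
      const completed = data.completed.filter(
        (entry: unknown): entry is string => typeof entry === "string",
      );
      return { completed };
    }
  } catch {
    // Fall through: unreadable progress is treated as "no progress".
  }
  return null;
}

// Pages still to convert: everything not marked completed.
function pendingUrls(all: string[], progress: Progress | null): string[] {
  if (progress === null) return [...all];
  const done = new Set(progress.completed);
  return all.filter((url) => !done.has(url));
}
```

In this sketch, pages that failed in the earlier run are simply absent from `completed`, so they are picked up again on resume.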
For troubleshooting, failed pages generate error screenshots:
pdfs/category/001_page_error.png
After conversion, gb2pdf shows detailed statistics:
- Successful conversions
- Failed attempts
- Total file size
- Processing time
- Average metrics
- No data sent to external services
- All processing happens locally
- No storage of GitBook credentials
- Respects robots.txt and rate limits
MIT License © 2025 Ian Irizarry