A powerful Model Context Protocol (MCP) server that converts 29+ file formats to clean, structured Markdown using Microsoft's MarkItDown library.
Perfect for Claude Desktop, MCP clients, and AI workflows!
- MCP Protocol: Seamless integration with Claude Desktop and MCP clients
- 29+ File Formats: PDFs, Office docs, images, audio, archives, and more
- Image Metadata: Extract EXIF metadata from images (JPG, PNG, GIF, etc.)
- Speech Recognition: Convert audio to text with speech transcription (MP3, WAV)*
*Requires markitdown[all] installation for full functionality
File Type | Required Dependencies | Install Command |
---|---|---|
PDF | pypdf, pymupdf, pdfplumber | pipx inject markitdown-mcp 'markitdown[all]' |
Excel (.xlsx, .xls) | openpyxl, xlrd, pandas | pipx inject markitdown-mcp openpyxl xlrd pandas |
PowerPoint (.pptx) | python-pptx | Included in base install |
Images | PIL, exiftool (optional) | Included in base install |
Audio | pydub, speech_recognition | pipx inject markitdown-mcp 'markitdown[all]' |
Basic formats | None | Base install only |
Note: For the best experience, we recommend installing all dependencies using the Complete Install method below.
- Office Documents: Word, PowerPoint, Excel files
- Web Content: HTML, XML, JSON, CSV
- E-books & Archives: EPUB, ZIP files
- Fast & Reliable: Built on Microsoft's MarkItDown library
- Install the server with ALL features:
  # One command to install everything
  pipx install git+https://github.com/trsdn/markitdown-mcp.git && \
    pipx inject markitdown-mcp 'markitdown[all]' openpyxl xlrd pandas pymupdf pdfplumber
- Add to your Claude Desktop config:
  {
    "mcpServers": {
      "markitdown": {
        "command": "markitdown-mcp",
        "args": []
      }
    }
  }
- Restart Claude Desktop and start converting files!
- Convert multiple file formats to Markdown
- Batch processing of entire directories
- Preserves directory structure in output
- Environment variable support via .env file
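To make "preserves directory structure" concrete, here is a minimal sketch of how input files could be mapped to output paths. It is not the tool's actual implementation; the function name `plan_outputs` and its parameters are hypothetical.

```python
from pathlib import Path


def plan_outputs(input_dir: Path, output_dir: Path, extensions: set) -> dict:
    """Map each supported input file to a .md path that mirrors its subdirectory.

    Hypothetical helper for illustration; the real tool may work differently.
    """
    plan = {}
    for source in sorted(input_dir.rglob("*")):
        if source.is_file() and source.suffix.lower() in extensions:
            # Keep the path relative to the input root, swap the extension.
            relative = source.relative_to(input_dir)
            plan[source] = (output_dir / relative).with_suffix(".md")
    return plan
```

With this mapping, `input/reports/q1.pdf` would land at `output/reports/q1.md`. Note that two files differing only by extension in the same folder would map to the same output path, so a real implementation needs a collision policy.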
Convert a single file to Markdown.
{
  "name": "convert_file",
  "arguments": {
    "file_path": "/path/to/document.pdf"
  }
}
Get a complete list of supported file formats.
{
  "name": "list_supported_formats",
  "arguments": {}
}
Convert all supported files in a directory.
{
  "name": "convert_directory",
  "arguments": {
    "input_directory": "/path/to/files",
    "output_directory": "/path/to/markdown"
  }
}
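MCP clients such as Claude Desktop send these payloads for you. For anyone scripting against the server directly, the sketch below shows how a `name`/`arguments` pair like the ones above is wrapped in a JSON-RPC 2.0 `tools/call` request, which is the envelope MCP uses for tool invocations. The helper name `build_tool_call` is hypothetical, and the sketch only builds the message; it does not handle the initialization handshake or the transport.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def build_tool_call(name: str, arguments: dict) -> str:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 `tools/call` request."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


message = build_tool_call("convert_file", {"file_path": "/path/to/document.pdf"})
```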
Category | Extensions | Features |
---|---|---|
Office | .pdf, .docx, .pptx, .xlsx, .xls | Full document structure |
Images | .jpg, .png, .gif, .bmp, .tiff, .webp | EXIF metadata extraction |
Audio | .mp3, .wav | Speech-to-text transcription |
Web | .html, .htm, .xml, .json, .csv | Clean formatting |
Books | .epub | Chapter extraction |
Archives | .zip | Auto-extract and process |
Text | .txt, .md, .rst | Direct conversion |
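If you want to pre-filter files before sending them to the server, the table above can be expressed as a simple lookup. This is a sketch based only on the extensions listed in this README; the names `CATEGORIES` and `category_for` are hypothetical, and the server's own `list_supported_formats` tool remains the authoritative source.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the table in this README.
CATEGORIES = {
    "Office": {".pdf", ".docx", ".pptx", ".xlsx", ".xls"},
    "Images": {".jpg", ".png", ".gif", ".bmp", ".tiff", ".webp"},
    "Audio": {".mp3", ".wav"},
    "Web": {".html", ".htm", ".xml", ".json", ".csv"},
    "Books": {".epub"},
    "Archives": {".zip"},
    "Text": {".txt", ".md", ".rst"},
}


def category_for(path: str) -> Optional[str]:
    """Return the table category for a file, or None if its extension is not listed."""
    suffix = Path(path).suffix.lower()  # case-insensitive match on the extension
    for category, extensions in CATEGORIES.items():
        if suffix in extensions:
            return category
    return None
```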
# Install from local directory
pip install -e /Users/torstenmahr/GitHub/markitdown-mcp
# Or navigate to the directory first
cd /Users/torstenmahr/GitHub/markitdown-mcp
pip install -e .
cd /Users/torstenmahr/GitHub/markitdown-mcp
source venv/bin/activate
pip install -r requirements.txt
After pip installation:
# Start the MCP server (for use with MCP clients)
markitdown-mcp
Or using the development script:
python run_server.py
Install with ALL dependencies in one command:
# Using pipx (recommended)
pipx install git+https://github.com/trsdn/markitdown-mcp.git && \
pipx inject markitdown-mcp 'markitdown[all]' openpyxl xlrd pandas pymupdf pdfplumber pytesseract pydub speechrecognition
# Or download and run the install script
curl -sSL https://raw.githubusercontent.com/trsdn/markitdown-mcp/main/install-all-deps.sh | bash
pip install -e "git+https://github.com/trsdn/markitdown-mcp.git#egg=markitdown-mcp"
To ensure all file formats are supported, use one of these methods:
# Install the MCP server
pipx install git+https://github.com/trsdn/markitdown-mcp.git
# Install all required dependencies for full functionality
pipx inject markitdown-mcp 'markitdown[all]' # PDF, OCR, Speech
pipx inject markitdown-mcp openpyxl xlrd pandas # Excel support
pipx inject markitdown-mcp pymupdf pdfplumber # Advanced PDF
# Create and activate virtual environment
python -m venv markitdown-env
source markitdown-env/bin/activate # On Windows: markitdown-env\Scripts\activate
# Install with all dependencies in one command
git clone https://github.com/trsdn/markitdown-mcp.git
cd markitdown-mcp
pip install -e ".[all]" # This installs everything!
If you already have the MCP server installed but some formats aren't working:
# Find your installation
which markitdown-mcp # Shows path like /Users/you/.local/bin/markitdown-mcp
# Inject missing dependencies
pipx inject markitdown-mcp 'markitdown[all]' openpyxl xlrd pandas pymupdf pdfplumber
After installation, verify all dependencies are properly installed:
# Test the MCP server
markitdown-mcp --help
# For pipx installations, check injected packages
pipx list --include-injected
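As a complement to the commands above, the following sketch checks from Python whether the optional modules can be found. The import names are assumptions (a package's import name can differ from its pip name, for example `python-pptx` is imported as `pptx`), and the script has to run with the interpreter of the environment where markitdown-mcp is installed, which for pipx is the pipx-managed virtual environment rather than your system Python.

```python
from importlib.util import find_spec

# Import names are assumptions derived from the dependency table in this README.
OPTIONAL_MODULES = {
    "PDF": ["pypdf", "pdfplumber"],
    "Excel": ["openpyxl", "xlrd", "pandas"],
    "PowerPoint": ["pptx"],
    "Images": ["PIL"],
    "Audio": ["pydub", "speech_recognition"],
}


def missing_modules(modules_by_type: dict) -> dict:
    """Return, per file type, the modules that cannot be found in this environment."""
    missing = {}
    for file_type, modules in modules_by_type.items():
        absent = [name for name in modules if find_spec(name) is None]
        if absent:
            missing[file_type] = absent
    return missing
```

Calling `missing_modules(OPTIONAL_MODULES)` returns an empty dict when everything is importable; otherwise the keys tell you which file types are likely to fail.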
Add this to your Claude Desktop claude_desktop_config.json:
{
  "mcpServers": {
    "markitdown": {
      "command": "markitdown-mcp",
      "args": []
    }
  }
}
Config file locations:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
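If you already have other servers configured, the entry should be merged into the existing file rather than replacing it. The sketch below resolves the two locations listed above and adds the `markitdown` entry without touching other keys. The function names `config_path` and `add_server` are hypothetical; reading and writing the file is left to the caller.

```python
import os
import sys
from pathlib import Path


def config_path() -> Path:
    """Resolve the Claude Desktop config path for the platforms listed in this README."""
    if sys.platform == "darwin":
        return Path.home() / "Library" / "Application Support" / "Claude" / "claude_desktop_config.json"
    if sys.platform == "win32":
        return Path(os.environ["APPDATA"]) / "Claude" / "claude_desktop_config.json"
    raise RuntimeError("No config location is documented for this platform")


def add_server(config: dict) -> dict:
    """Return a copy of the config with the markitdown server entry added."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))  # keep any existing servers
    servers["markitdown"] = {"command": "markitdown-mcp", "args": []}
    updated["mcpServers"] = servers
    return updated
```

A typical use would be to load the file with `json.loads(config_path().read_text())`, pass the result through `add_server`, and write it back with `json.dumps(..., indent=2)` after making a backup.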
Convert the file ~/Documents/report.pdf to markdown
Convert all files in ~/Downloads/documents/ to markdown
What file formats can you convert to markdown?
If you see errors like:
PdfConverter threw MissingDependencyException
XlsxConverter threw MissingDependencyException
PptxConverter threw BadZipFile
This means some optional dependencies are missing. Follow the Complete Install instructions above.
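For scripted setups, error lines like the ones above can be mapped to an install command. This is an illustrative sketch only: the commands are copied from this README, while the mapping, the regular expression, and the function name `suggest_fix` are assumptions about what such a helper could look like.

```python
import re
from typing import Optional

# Install commands copied from this README; the mapping itself is an illustration.
SPECIFIC_FIXES = {
    "PdfConverter": "pipx inject markitdown-mcp 'markitdown[all]'",
    "XlsxConverter": "pipx inject markitdown-mcp openpyxl xlrd pandas",
}
COMPLETE_FIX = (
    "pipx inject markitdown-mcp 'markitdown[all]' "
    "openpyxl xlrd pandas pymupdf pdfplumber"
)
_ERROR = re.compile(r"(\w+Converter) threw (\w+)")


def suggest_fix(log_line: str) -> Optional[str]:
    """Suggest an install command for a converter error line, or None if it is not one."""
    match = _ERROR.search(log_line)
    if match is None:
        return None
    converter, exception = match.groups()
    if exception == "MissingDependencyException" and converter in SPECIFIC_FIXES:
        return SPECIFIC_FIXES[converter]
    # Anything else falls back to injecting the complete dependency set.
    return COMPLETE_FIX
```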
Some Markdown files with special characters may fail with UnicodeDecodeError. This is a known limitation in the MarkItDown library.
- "externally-managed-environment" error: Use pipx instead of pip
- Permission denied: Never use sudo with pip; use pipx or virtual environments
- Command not found: Make sure ~/.local/bin is in your PATH
See KNOWN_ISSUES.md for more details.
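For the "Command not found" case, a small check can tell whether the executable is missing or merely not on your PATH. This sketch uses only the standard library; the function name `diagnose_command` is hypothetical.

```python
import os
import shutil
from pathlib import Path


def diagnose_command(command: str = "markitdown-mcp") -> str:
    """Report whether a command is on PATH and whether ~/.local/bin is listed there."""
    found = shutil.which(command)
    if found:
        return f"{command} found at {found}"
    local_bin = Path.home() / ".local" / "bin"
    path_dirs = [Path(p) for p in os.environ.get("PATH", "").split(os.pathsep) if p]
    if local_bin not in path_dirs:
        return f"{command} not found; {local_bin} is not in PATH"
    return f"{command} not found, although {local_bin} is in PATH"
```

If the second message appears, the install itself probably did not complete, and re-running the pipx install is the next step to try.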
No special configuration required. The tool uses the MarkItDown library for document conversion.
# Convert all supported files from input/ to output/
python mdconvert.py
Specify custom input and output directories:
python mdconvert.py --input /path/to/docs --output /path/to/markdown
Convert a single file:
python mdconvert.py --file document.pdf
- --input, -i: Input directory (default: input)
- --output, -o: Output directory (default: output)
- --file, -f: Convert a single file instead of a directory
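The options above correspond to a conventional `argparse` setup. The sketch below reproduces the documented flags and defaults; it is an assumption about how such a parser could be written, not a copy of `mdconvert.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags and defaults documented in this README."""
    parser = argparse.ArgumentParser(
        prog="mdconvert.py", description="Convert files to Markdown."
    )
    parser.add_argument("--input", "-i", default="input", help="Input directory")
    parser.add_argument("--output", "-o", default="output", help="Output directory")
    parser.add_argument(
        "--file", "-f", default=None,
        help="Convert a single file instead of a directory",
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--file", "document.pdf"])
```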
The MCP server provides three tools:
convert_file: Convert a single file to Markdown.
- Input: File path or base64 encoded content with filename
- Output: Converted Markdown content
list_supported_formats: List all supported file formats.
- Output: Categorized list of supported file extensions
convert_directory: Convert all supported files in a directory.
- Input: Input directory path, optional output directory
- Output: Summary of conversion results
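When the file is not reachable by path from the server, the base64 input mentioned for `convert_file` is the alternative. This README does not name the arguments for that mode, so `file_content` and `filename` in the sketch below are hypothetical placeholders; check the server's tool schema for the real names. The encoding step itself is standard.

```python
import base64


def build_base64_arguments(filename: str, data: bytes) -> dict:
    """Build tool arguments carrying file bytes as base64 text.

    The keys "file_content" and "filename" are hypothetical argument names.
    """
    return {
        "file_content": base64.b64encode(data).decode("ascii"),
        "filename": filename,
    }
```

The filename matters because, without a path, the extension is the only hint the converter has about the file type.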
markitdown-mcp/
├── mcp_server.py      # MCP protocol server
├── mdconvert.py       # CLI script
├── run_server.py      # Server runner script
├── mcp_config.json    # MCP configuration
├── requirements.txt   # Python dependencies
├── README.md          # This file
├── input/             # Default input directory
├── output/            # Default output directory
└── venv/              # Virtual environment
This MCP server leverages Microsoft's MarkItDown library to provide intelligent document conversion:
- PDFs: Extracts text, tables, and structure
- Images: Uses OCR to extract text content + EXIF metadata
- Audio: Converts speech to text transcription (MP3, WAV)
- Office: Preserves formatting from Word, Excel, PowerPoint
- HTML: Converts to clean, readable Markdown
- Archives: Automatically extracts and processes contents
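To see what the underlying library does without the MCP layer, MarkItDown can be called directly. The sketch below uses MarkItDown's `convert` method and the `text_content` attribute of its result; how this server wraps those calls internally is not shown in this README, so treat the function `convert_to_markdown` and its optional `converter` parameter as illustrative.

```python
def convert_to_markdown(path: str, converter=None) -> str:
    """Convert one file to Markdown text using MarkItDown.

    `converter` is a hypothetical injection point so the function can be
    exercised with a stand-in object; by default a real MarkItDown is used.
    """
    if converter is None:
        # Imported lazily so the module loads even where markitdown is absent.
        from markitdown import MarkItDown
        converter = MarkItDown()
    result = converter.convert(path)
    return result.text_content
```

Running `convert_to_markdown("report.pdf")` in an environment with `markitdown[all]` installed is a quick way to tell whether a failure comes from the library or from the MCP setup.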
mcp, model-context-protocol, claude-desktop, markdown, document-conversion, pdf, ocr, speech-to-text, markitdown, ai-tools
- Python: 3.10+
- MCP Client: Claude Desktop or compatible MCP client
- Dependencies: Automatically installed via pip
We welcome contributions! Here's how you can help:
# Fork and clone the repository
git clone https://github.com/YOUR_USERNAME/markitdown-mcp.git
cd markitdown-mcp
# Set up development environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
# Test your changes
markitdown-mcp # Test the server works
- Bug Reports: Found an issue? Report it
- Feature Requests: Have an idea? Suggest it
- New File Formats: Add support for more file types
- Documentation: Improve guides and examples
- Testing: Add tests and improve reliability
- Code Quality: Refactor and optimize
- Read our Contributing Guide
- Check existing issues
- Fork the repository
- Create a feature branch (feat/amazing-feature)
- Make your changes with tests
- Submit a pull request
Please read CONTRIBUTING.md for detailed guidelines.
MIT License - see LICENSE file for details.