A FastMCP server for Moondream, an AI vision language model. This server provides image analysis capabilities including captioning, visual question answering, object detection, and visual pointing through the Model Context Protocol (MCP).
- Image Captioning: Generate short, normal, or detailed captions for images
- Visual Question Answering: Ask natural language questions about images
- Object Detection: Detect and locate specific objects with bounding boxes
- Visual Pointing: Get precise coordinates of objects in images
- URL Support: Process images from both local files and remote URLs
- Batch Processing: Analyze multiple images efficiently
- Device Optimization: Automatic detection and optimization for CPU, CUDA, and MPS (Apple Silicon)
- Python 3.10 or higher
- PyTorch 2.0+ (with appropriate device support)
# Run without installation
uvx moondream-mcp
# Or specify a specific version
uvx moondream-mcp==1.0.2
# Or install with pip
pip install moondream-mcp
# Or install from source
git clone https://github.com/ColeMurray/moondream-mcp.git
cd moondream-mcp
pip install -e .
# Or install from source with development dependencies
git clone https://github.com/ColeMurray/moondream-mcp.git
cd moondream-mcp
pip install -e ".[dev]"
# Using uvx (no installation needed)
uvx moondream-mcp
# Using pip-installed command
moondream-mcp
# Or run directly with Python
python -m moondream_mcp.server
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "moondream": {
      "command": "uvx",
      "args": ["moondream-mcp"],
      "env": {
        "MOONDREAM_DEVICE": "auto"
      }
    }
  }
}
Or, using the pip-installed command:
{
  "mcpServers": {
    "moondream": {
      "command": "moondream-mcp",
      "env": {
        "MOONDREAM_DEVICE": "auto"
      }
    }
  }
}
The server can be configured using environment variables:
- MOONDREAM_MODEL_NAME: Model name (default: vikhyatk/moondream2)
- MOONDREAM_MODEL_REVISION: Model revision (default: 2025-01-09)
- MOONDREAM_TRUST_REMOTE_CODE: Trust remote code (default: true)
- MOONDREAM_DEVICE: Force specific device (cpu, cuda, mps, or auto)
- MOONDREAM_MAX_IMAGE_SIZE: Maximum image dimensions (default: 2048x2048)
- MOONDREAM_MAX_FILE_SIZE_MB: Maximum file size in MB (default: 50)
- MOONDREAM_TIMEOUT_SECONDS: Processing timeout (default: 120)
- MOONDREAM_MAX_CONCURRENT_REQUESTS: Max concurrent requests (default: 5)
- MOONDREAM_ENABLE_STREAMING: Enable streaming for captions (default: true)
- MOONDREAM_MAX_BATCH_SIZE: Maximum batch size for batch operations (default: 10)
- MOONDREAM_BATCH_CONCURRENCY: Concurrent batch processing limit (default: 3)
- MOONDREAM_ENABLE_BATCH_PROGRESS: Enable progress reporting for batch operations (default: true)
- MOONDREAM_REQUEST_TIMEOUT_SECONDS: HTTP request timeout (default: 30)
- MOONDREAM_MAX_REDIRECTS: Maximum HTTP redirects (default: 5)
- MOONDREAM_USER_AGENT: HTTP User-Agent header
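As a rough illustration of how these variables might be consumed, the sketch below reads a few of them and falls back to the defaults listed above. The Settings dataclass and the load_settings and parse_size helpers are hypothetical names for this example, not part of the moondream-mcp API, and the "auto" fallback for MOONDREAM_DEVICE is an assumption based on the configuration examples.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for a subset of the documented variables."""
    device: str
    max_image_size: Tuple[int, int]
    max_file_size_mb: int
    timeout_seconds: int
    max_concurrent_requests: int


def parse_size(value: str) -> Tuple[int, int]:
    """Parse a 'WIDTHxHEIGHT' string such as '2048x2048' (assumed format)."""
    width, height = value.lower().split("x")
    return int(width), int(height)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from env, falling back to the documented defaults."""
    return Settings(
        # Assumption: "auto" is used when the variable is unset.
        device=env.get("MOONDREAM_DEVICE", "auto"),
        max_image_size=parse_size(env.get("MOONDREAM_MAX_IMAGE_SIZE", "2048x2048")),
        max_file_size_mb=int(env.get("MOONDREAM_MAX_FILE_SIZE_MB", "50")),
        timeout_seconds=int(env.get("MOONDREAM_TIMEOUT_SECONDS", "120")),
        max_concurrent_requests=int(env.get("MOONDREAM_MAX_CONCURRENT_REQUESTS", "5")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of os.environ makes the same function easy to exercise in tests without touching the real environment.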
Generate captions for images.
Parameters:
- image_path (string): Path to image file or URL
- length (string): Caption length, one of "short", "normal", or "detailed"
- stream (boolean): Whether to stream caption generation
Example:
{
  "image_path": "https://example.com/image.jpg",
  "length": "detailed",
  "stream": false
}
Ask questions about images.
Parameters:
- image_path (string): Path to image file or URL
- question (string): Question to ask about the image
Example:
{
  "image_path": "/path/to/image.jpg",
  "question": "How many people are in this image?"
}
Detect specific objects in images.
Parameters:
- image_path (string): Path to image file or URL
- object_name (string): Name of object to detect
Example:
{
  "image_path": "https://example.com/photo.jpg",
  "object_name": "person"
}
Get coordinates of objects in images.
Parameters:
- image_path (string): Path to image file or URL
- object_name (string): Name of object to locate
Example:
{
  "image_path": "/path/to/image.jpg",
  "object_name": "car"
}
Multi-purpose image analysis tool.
Parameters:
- image_path (string): Path to image file or URL
- operation (string): Operation type ("caption", "query", "detect", "point")
- parameters (string): JSON string with operation-specific parameters
Example:
{
  "image_path": "https://example.com/image.jpg",
  "operation": "query",
  "parameters": "{\"question\": \"What is the weather like?\"}"
}
Process multiple images in batch.
Parameters:
- image_paths (string): JSON array of image paths
- operation (string): Operation to perform on all images
- parameters (string): JSON string with operation-specific parameters
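Because image_paths and parameters are passed as JSON-encoded strings rather than native lists or objects, it can be less error-prone to build them with json.dumps than to escape quotes by hand. A minimal sketch follows; build_batch_args is a hypothetical helper written for this example, not something the server provides.

```python
import json
from typing import Any, Dict, List


def build_batch_args(
    paths: List[str], operation: str, params: Dict[str, Any]
) -> Dict[str, str]:
    """Encode list and dict arguments as the JSON strings the tool expects."""
    return {
        "image_paths": json.dumps(paths),
        "operation": operation,
        "parameters": json.dumps(params),
    }


if __name__ == "__main__":
    args = build_batch_args(
        ["image1.jpg", "image2.jpg"], "caption", {"length": "short"}
    )
    # Serializing the whole argument dict adds the backslash escaping
    # seen in raw JSON tool calls.
    print(json.dumps(args, indent=2))
```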
Example:
{
  "image_paths": "[\"image1.jpg\", \"image2.jpg\"]",
  "operation": "caption",
  "parameters": "{\"length\": \"short\"}"
}
# Using the caption_image tool
result = await caption_image(
    image_path="https://example.com/sunset.jpg",
    length="detailed"
)
# Ask about image content
result = await query_image(
    image_path="/path/to/family_photo.jpg",
    question="How many children are in this photo?"
)
# Detect faces in an image
result = await detect_objects(
    image_path="https://example.com/group_photo.jpg",
    object_name="face"
)
# Process multiple images
result = await batch_analyze_images(
    image_paths='["img1.jpg", "img2.jpg", "img3.jpg"]',
    operation="caption",
    parameters='{"length": "normal"}'
)
The server automatically detects and optimizes for available hardware:
Apple Silicon (MPS):
- Optimal performance on M1/M2/M3 Macs
- Automatic memory management
- Native acceleration
NVIDIA GPUs (CUDA):
- GPU acceleration for NVIDIA cards
- Automatic CUDA memory management
- Mixed precision support
CPU:
- Works on any system
- Optimized for multi-core processing
- Lower memory requirements
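To make the auto-detection idea concrete, here is a minimal sketch of how a MOONDREAM_DEVICE-style value could be resolved to a device. It is not the server's actual logic: select_device is a hypothetical function, and the CUDA, then MPS, then CPU priority order is an assumption. Availability is passed in as flags so the example runs without PyTorch.

```python
def select_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Resolve a requested device ("auto", "cpu", "cuda", "mps") to a concrete one."""
    choice = requested.strip().lower()
    if choice == "auto":
        # Assumed priority: CUDA first, then MPS, then CPU as the fallback.
        if cuda_available:
            return "cuda"
        if mps_available:
            return "mps"
        return "cpu"
    availability = {"cpu": True, "cuda": cuda_available, "mps": mps_available}
    if choice not in availability:
        raise ValueError(f"Unknown device: {requested!r}")
    if not availability[choice]:
        raise RuntimeError(f"Requested device {choice!r} is not available")
    return choice


if __name__ == "__main__":
    # With PyTorch installed, the flags could come from
    # torch.cuda.is_available() and torch.backends.mps.is_available().
    print(select_device("auto", cuda_available=False, mps_available=True))  # prints "mps"
```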
The server provides detailed error information:
{
  "success": false,
  "error_message": "Image file not found: /path/to/missing.jpg",
  "error_code": "IMAGE_PROCESSING_ERROR",
  "processing_time_ms": 15.2
}
Common error codes:
- MODEL_LOAD_ERROR: Issues loading the Moondream model
- IMAGE_PROCESSING_ERROR: Problems with image files or URLs
- INFERENCE_ERROR: Model inference failures
- INVALID_REQUEST: Invalid parameters or requests
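A client could branch on error_code to decide what to tell the user. The sketch below parses a response shaped like the one above; describe_result and the HINTS table are hypothetical and written for illustration, and it assumes that successful responses carry "success": true.

```python
import json

# Illustrative hints only; the wording is not taken from the server.
HINTS = {
    "MODEL_LOAD_ERROR": "check the PyTorch installation and the model download",
    "IMAGE_PROCESSING_ERROR": "check the image path or URL",
    "INFERENCE_ERROR": "retry, possibly with a smaller image",
    "INVALID_REQUEST": "check the tool parameters",
}


def describe_result(raw: str) -> str:
    """Turn a JSON tool response into a short human-readable status line."""
    result = json.loads(raw)
    if result.get("success"):  # assumption: success responses set this to true
        return "ok"
    code = result.get("error_code", "UNKNOWN")
    message = result.get("error_message", "no message")
    hint = HINTS.get(code, "no hint available")
    return f"{code}: {message} (hint: {hint})"


if __name__ == "__main__":
    sample = json.dumps({
        "success": False,
        "error_message": "Image file not found: /path/to/missing.jpg",
        "error_code": "IMAGE_PROCESSING_ERROR",
        "processing_time_ms": 15.2,
    })
    print(describe_result(sample))
```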
- Use appropriate image sizes: Resize large images before processing
- Batch processing: Use batch_analyze_images for multiple images
- Device optimization: Let the server auto-detect the best device
- Concurrent requests: Adjust MOONDREAM_MAX_CONCURRENT_REQUESTS based on your hardware
- Memory management: Monitor memory usage, especially with large images
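For the first tip, the target dimensions for a pre-resize can be computed without any imaging library. The sketch below scales an image down to fit inside a bounding box while keeping its aspect ratio; fit_within is a hypothetical helper, and the 2048x2048 default simply mirrors the documented MOONDREAM_MAX_IMAGE_SIZE default.

```python
from typing import Tuple


def fit_within(
    width: int, height: int, max_size: Tuple[int, int] = (2048, 2048)
) -> Tuple[int, int]:
    """Return dimensions scaled down to fit max_size, preserving aspect ratio.

    Images that already fit are returned unchanged (never upscaled).
    """
    max_w, max_h = max_size
    scale = min(max_w / width, max_h / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(4096, 2048))  # prints (2048, 1024)
    print(fit_within(800, 600))    # prints (800, 600)
```

With Pillow installed, the resulting tuple could be passed to Image.resize before the file is handed to the server.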
# Check PyTorch installation
python -c "import torch; print(torch.__version__)"
# Check device availability
python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}, MPS: {torch.backends.mps.is_available()}')"
For memory issues:
- Reduce MOONDREAM_MAX_IMAGE_SIZE
- Lower MOONDREAM_MAX_CONCURRENT_REQUESTS
- Use CPU instead of GPU for large images
For network or URL issues:
- Check firewall settings for URL access
- Increase MOONDREAM_REQUEST_TIMEOUT_SECONDS
- Verify SSL certificates for HTTPS URLs
# Run tests
pytest tests/
# Format code
black src/ tests/
# Sort imports
isort src/ tests/
# Type checking
mypy src/
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Run quality checks
- Submit a pull request
This project is licensed under the MIT License. See LICENSE for details.
- Moondream - The amazing vision language model
- FastMCP - The MCP server framework
- Model Context Protocol - The protocol specification
- Documentation
- Issue Tracker
- Discussions
Note: This server requires downloading the Moondream model on first use, which may take some time depending on your internet connection.