AI-powered assistant to help you with your daily tasks, powered by Llama 3, DeepSeek R1, and many more models on HuggingFace.
Updated Mar 2, 2025 - Python
A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models such as YOLO, FastVLM, and more.
ComfyUI wrapper for Moondream's gaze detection
Use the Moondream 2 model to detect faces and their gaze directions in videos.
Moondream is a lightweight multimodal large language model
Python scripts for captioning images with VLMs
A powerful video summarization tool that utilizes Moondream alongside multiple AI models to provide comprehensive video understanding through audio transcription, intelligent frame selection, visual description, and content summarization.
Automatically annotates YOLO datasets using the Moondream vision model
Moondream inference on Modal
Full-text search of your meme/screenshot folder using small LMs and traditional OCR
This tool uses Moondream 2B, a powerful yet lightweight vision-language model, to detect and redact objects from videos. Moondream can recognize a wide variety of objects, people, text, and more with high accuracy while being much smaller than most vision models.
Pheye - a family of efficient small vision-language models
Unleashing the power of local VLMs with Moondream and Streamlit
MoonLabel: Moondream VLM labeler for object detection and image captions with one‑click YOLO, COCO, VOC, and Captions export.
Capitalizing on Moondream's capabilities to build a frame-by-frame CCTV analyzer