A curated list of Model Zoos & Hubs where you can find production-ready and optimized models for resource-constrained devices.
Model Zoo | Description | Links |
---|---|---|
Edge AI Labs Model Zoo | A collection of pre-trained, optimized models for low-power devices. | EdgeAI Labs |
Edge Impulse Model Zoo | A repository of models optimized for edge devices. | Edge Impulse Model Zoo |
ONNX Model Zoo | A collection of pre-trained, state-of-the-art models in the ONNX format. | ONNX Model Zoo |
NVIDIA Pretrained AI Models (NGC + TAO) | Customizable pretrained models from NVIDIA for accelerating AI development. | NVIDIA Pretrained AI Models (Main), NGC Model Catalog, TAO Model Zoo |
OpenVINO Model Zoo | A collection of pre-trained models ready for use with Intel's OpenVINO toolkit. | OpenVINO Model Zoo |
Qualcomm Models Zoo | A collection of AI models from Qualcomm. | Qualcomm Models Zoo |
LiteRT Pre-trained Models | Pre-trained models optimized for LiteRT, Google's on-device runtime (formerly TensorFlow Lite). | LiteRT Pre-trained Models |
Keras Applications | Keras models shipped with pre-trained weights. | Keras Pre-trained Models |
MediaPipe | Framework for building multimodal applied machine learning pipelines. | MediaPipe |
TensorFlow Model Garden | A repository with a collection of TensorFlow models. | TensorFlow Model Garden |
PyTorch Model Zoo | A hub for pre-trained models for the PyTorch framework. | PyTorch Model Zoo |
stm32ai-modelzoo | AI Model Zoo for STM32 microcontroller devices. | stm32ai-modelzoo |
Model Zoo | A collection of pre-trained models for various machine learning tasks. | Model Zoo |
Hugging Face Models | A collection of pre-trained models for various machine learning tasks. | Hugging Face Models |
Papers with Code | A repository that links academic papers to their respective code and models. | Papers with Code |
MXNet Model Zoo | A collection of pre-trained models for the Apache MXNet framework. | MXNet Model Zoo |
Deci’s Model Zoo | A curated list of high-performance deep learning models. | Deci’s Model Zoo |
Jetson Model Zoo and Community Projects | NVIDIA's collection of models and projects for the Jetson platform. | Jetson Model Zoo and Community Projects |
Magenta | Models for music and art generation from Google's Magenta project. | Magenta |
Awesome-CoreML-Models | A collection of CoreML models for iOS developers. | Awesome-CoreML-Models |
Pinto Models | A variety of models for computer vision tasks. | Pinto Models |
Baidu AI Open Model Zoo | Baidu's collection of AI models. | Baidu AI Open Model Zoo |
Hailo Model Zoo | A set of models optimized for Hailo's AI processors. | Hailo Model Zoo |
This is a non-exhaustive selection of models from several of the platforms listed in Section 1, grouped into six domains and a variety of tasks, with a focus on efficiency and real-world applications.
Domain | Task | Model | Description | Reference |
---|---|---|---|---|
Computer Vision | Object Detection | yolov8_det | Object detection for edge devices | YOLOv8 on GitHub |
Computer Vision | Image Classification | mobilenet_v3_small | Lightweight image classification | MobileNetV3 on TensorFlow Hub |
Computer Vision | Semantic Segmentation | deeplabv3_resnet50 | Semantic image segmentation | DeepLabV3 on TensorFlow Hub |
Computer Vision | Instance Segmentation | yolov8_seg | Object detection and segmentation | YOLOv8 on GitHub |
Computer Vision | Object Tracking | DeepSort | Real-time object tracking | DeepSort on GitHub |
Computer Vision | Pose Estimation | openpose | Human pose estimation | OpenPose on GitHub |
Computer Vision | Facial Recognition | mediapipe_face | Face detection and recognition | MediaPipe Face on Google AI |
Computer Vision | Optical Character Recognition | trocr | Text recognition in images | TrOCR on Hugging Face |
Computer Vision | Video Classification | resnet_2plus1d | Video classification for action recognition | ResNet-2+1D on PyTorch Hub |
Computer Vision | Video Classification | resnet_3d | 3D CNN for video classification | ResNet-3D on PyTorch Hub |
Audio Processing | Speech-to-Text | distil-whisper | Lightweight speech recognition model | Distil-Whisper on Hugging Face |
Audio Processing | Sound Classification | audio-spectrogram-transformer | Transformer for audio classification | AST on Hugging Face |
Audio Processing | Voice Activity Detection | silero-vad | Voice activity detection for edge devices | Silero VAD on GitHub |
Audio Processing | Acoustic Scene Classification | panns | Audio tagging and scene classification | PANNS on GitHub |
Audio Processing | Speaker Diarization | pyannote-audio | Speaker diarization and segmentation | PyAnnote on Hugging Face |
Audio Processing | Speech Recognition | wav2vec2 | Self-supervised speech representation learning | Wav2vec2 on Hugging Face |
Time Series | Predictive Maintenance | tsmixer | Time-series forecasting for maintenance | TimesFM on GitHub |
Time Series | Anomaly Detection | IsolationForest | Anomaly detection in time-series data | IsolationForest on Scikit-learn |
Time Series | Forecasting | informer | Transformer-based time-series forecasting | Informer on GitHub |
Time Series | Time-Series Classification | rocket | Efficient time-series classification | ROCKET on GitHub |
Time Series | Image Super-Resolution | real_esrgan_x4plus | Image super-resolution for temporal data | Real-ESRGAN on GitHub |
Time Series | Image Inpainting | lama_dilated | Image inpainting for time-series analysis | LaMa on GitHub |
NLP | Speech Recognition | Whisper | General-purpose speech recognition model | Whisper on Hugging Face |
NLP | Keyword Spotting | silero-kws | Wake word detection for edge devices | Silero Models on GitHub |
NLP | Text Classification | distilbert | Lightweight transformer for text classification | DistilBERT on Hugging Face |
NLP | Named Entity Recognition | bert-ner | NER for entity extraction | BERT-NER on Hugging Face |
NLP | Question Answering | mobilebert | Lightweight QA model for edge | MobileBERT on Hugging Face |
NLP | Text Summarization | bart | Text summarization for short texts | BART on Hugging Face |
Generative AI | Image Generation & Synthesis | ControlNet | Fine control over image generation | ControlNet on GitHub |
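Generative AI | Image Generation & Synthesis | Stable Diffusion | Text-to-image generation | Stable Diffusion on Hugging Face |
Generative AI | Image Generation & Synthesis | stylegan2 | Image generation | StyleGAN2 on GitHub |
Generative AI | Image Generation & Synthesis | Flux.1-schnell | Fast text-to-image generation | Flux.1 on Hugging Face, Awesome-Smol |
Generative AI | Image Generation & Synthesis | Reve | Image generation with advanced text rendering | Reve on Hugging Face |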
- | - | - | ||
- | - | - | ||
Generative AI | Small Language Model (SLM) | SmolLM2-1.7B | Small language model for efficient text generation | SmolLM2 on Hugging Face, Awesome-Smol |
Generative AI | Small Language Model (SLM) | Gemma 2 | Lightweight open model for text generation | Gemma 2 on Hugging Face, Awesome-Smol |
Generative AI | Small Language Model (SLM) | Phi-3.5-mini | Small language model with strong reasoning | Phi-3.5-mini on Hugging Face, Awesome-Smol |
Generative AI | Small Language Model (SLM) | Qwen2.5-1.5B | Efficient language model for instruction following | Qwen2.5 on Hugging Face, Awesome-Smol |
Generative AI | Small Language Model (SLM) | Mixtral-8x22B | Sparse mixture of experts for text generation | Mixtral on Hugging Face |
- | - | - | ||
- | - | - | ||
Generative AI | Multimodality | SmolVLM-256M | Smallest vision-language model for image understanding | SmolVLM-256M on Hugging Face, Awesome-Smol |
Generative AI | Multimodality | SmolVLM-500M | Vision-language model for image and text tasks | SmolVLM-500M on Hugging Face, Awesome-Smol |
Generative AI | Multimodality | BakLLaVA-1 | Multimodal model for text and image tasks | BakLLaVA-1 on Hugging Face, Awesome-Smol |
Generative AI | Multimodality | PaliGemma | Vision-language model for multimodal tasks | PaliGemma on Hugging Face |
Generative AI | Multimodality | Seed1.5-VL | Vision-language model with strong multimodal performance | Seed1.5-VL on Hugging Face |
- | - | - | ||
- | - | - | ||
Misc | Sensor Fusion | mediapipe_pose | Human pose estimation using sensor data | MediaPipe Pose on Google AI |
Misc | Activity Recognition | har-cnn | Human activity recognition from sensor data | HAR-CNN on GitHub |
Misc | Contextual Awareness | SmolVLM-256M | Multimodal model for environment understanding | SmolVLM-256M on Hugging Face, Awesome-Smol |
Misc | Network Anomaly Detection | LOF | Local outlier factor for network anomalies | LOF on Scikit-learn |
Misc | Device Behavior Anomaly | Autoencoder | Anomaly detection for device behavior | Keras Autoencoder |
Misc | Sensor Data Anomaly | OC-SVM | One-class SVM for sensor data anomalies | OneClassSVM on Scikit-learn |
Misc | On-device Control Systems | TD3 | Twin Delayed DDPG for control systems | TD3 on GitHub |
Misc | Various | pinecone | Vector database | - |
Misc | Various | weaviate-c2 | Vector database | - |
Misc | Various | upstage | Various models | Awesome-Smol |
Selecting the right model for edge deployment is critical for balancing performance, accuracy, and efficiency.
- Efficiency: Edge devices (e.g., IoT, mobile, embedded systems) have limited compute, memory, and power.
- Performance: Real-time applications (e.g., autonomous drones, smart cameras) demand low latency and high accuracy.
- Scalability: A model that fits the target hardware keeps deployment cost-effective across many devices.
- Task Requirements: Match the model to your application (e.g., vision, audio, multimodal).
- Hardware Constraints: Consider compute (OPS), memory (MB), and energy (mWh) limits of your device.
- Performance Goals: Balance accuracy, latency, and throughput for your use case.
- Deployment Ease: Check compatibility with frameworks (e.g., TensorFlow Lite, ONNX).
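The considerations above can be turned into a first-pass filter. The following is a minimal Python sketch and is not part of any listed zoo's tooling: the `Candidate` class, the `shortlist` function, the placeholder model names, and all parameter counts are hypothetical and chosen only for illustration. It estimates weight storage as parameter count times bytes per parameter, which ignores activation memory and runtime overhead, so the figure is best read as a lower bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical catalog entry; all fields are illustrative assumptions."""
    name: str
    task: str                 # e.g. "vision", "audio", "multimodal"
    params_millions: float    # parameter count, in millions
    bytes_per_param: int      # 4 for float32 weights, 1 for int8 weights
    formats: tuple[str, ...]  # export formats the model is available in

    @property
    def weight_mb(self) -> float:
        # Weights only: activations and runtime overhead are not counted.
        return self.params_millions * 1e6 * self.bytes_per_param / (1024 ** 2)


def shortlist(candidates, task, memory_budget_mb, runtime_format):
    """Keep candidates that match the task, fit the memory budget, and are
    available in the required format; smallest footprint first."""
    fits = [
        c for c in candidates
        if c.task == task
        and c.weight_mb <= memory_budget_mb
        and runtime_format in c.formats
    ]
    return sorted(fits, key=lambda c: c.weight_mb)


# Placeholder entries with made-up sizes (not measurements of real models).
CANDIDATES = [
    Candidate("model_a_fp32", "vision", 2.5, 4, ("onnx", "tflite")),
    Candidate("model_a_int8", "vision", 2.5, 1, ("tflite",)),
    Candidate("model_b_fp32", "vision", 25.0, 4, ("onnx",)),
    Candidate("model_c_int8", "audio", 1.0, 1, ("tflite",)),
]

if __name__ == "__main__":
    for c in shortlist(CANDIDATES, "vision", memory_budget_mb=8.0,
                       runtime_format="tflite"):
        print(f"{c.name}: {c.weight_mb:.2f} MB")
```

A filter like this only narrows the field. Accuracy, latency, and energy use still have to be measured on the target device.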
Next Steps: Once you’ve shortlisted a model, use the Edge AI Benchmarking Guide to profile and optimize its performance.
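As a rough illustration of what profiling can look like, here is a small Python sketch that times repeated calls to an inference function and reports latency statistics. It is an assumption-laden stand-in, not the method of the Edge AI Benchmarking Guide: `profile_latency` and `dummy_inference` are hypothetical names, and the dummy workload should be swapped for a call into your own runtime.

```python
import math
import statistics
import time


def profile_latency(run_inference, warmup=5, runs=50):
    """Time `run_inference` and return latency statistics in milliseconds."""
    # Warm-up calls let caches and lazy initialization settle before timing.
    for _ in range(warmup):
        run_inference()

    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        run_inference()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples_ms)) - 1)
    mean_ms = statistics.fmean(samples_ms)
    return {
        "mean_ms": mean_ms,
        "median_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
        "throughput_per_s": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


def dummy_inference():
    # Hypothetical stand-in for a real model call.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    for key, value in profile_latency(dummy_inference).items():
        print(f"{key}: {value:.3f}")
```

Reporting a tail percentile alongside the mean matters for real-time use cases, because occasional slow runs can break a latency target even when the average looks acceptable.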