Repositories list (181 repositories)
qwen-image
Public

pyannote-speaker-diarization-3.1
Public template. A state-of-the-art model that segments and labels audio recordings by accurately distinguishing different speakers. <metadata> gpu: T4 | collections: ["HF Transformers"] </metadata>

facebook-bart-cnn
Public template. A variant of the BART model designed specifically for natural language summarization. It was pre-trained on a large corpus of English text and later fine-tuned on the CNN/Daily Mail dataset. <metadata> gpu: T4 | collections: ["HF Transformers"] </metadata>

A 30.5B MoE model purpose-tuned for code generation and agentic tool use. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
gpt-oss-20b
Public template. A 21B open-weight language model (with ~3.6 billion active parameters per token) developed by OpenAI for reasoning, tool integration, and low-latency usage. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

voxtral-mini-3b
Public template. A 3B-parameter audio-language model with speech transcription, translation, and audio understanding capabilities. <metadata> gpu: A10 | collections: ["HF_Transformers"] </metadata>

kyutai-tts-1.6b
Public template. A 1.6B-parameter text-to-speech model that supports real-time streaming text input with ultra-low latency and voice conditioning capabilities. <metadata> gpu: A10 | collections: ["HF_Transformers"] </metadata>

llama-3.1-8b-instruct-gguf
Public template. An 8B-parameter, instruction-tuned variant of Meta's Llama-3.1 model, optimized in GGUF format for efficient inference. <metadata> gpu: A100 | collections: ["llama.cpp"] </metadata>

stable-diffusion-3-5-large-turbo
Public template. A fast, optimized diffusion model that generates high-quality images from text prompts, ideal for creative visual content. <metadata> gpu: A100 | collections: ["Diffusers"] </metadata>

jina-embeddings-v4
Public template. A 3.8B multimodal, multilingual embedding model that unifies text and image understanding in a single late-interaction space and delivers both dense and multi-vector outputs. <metadata> gpu: A10 | collections: ["HF_Transformers"] </metadata>

flux-1-kontext-dev
Public template. A 12B model from Black Forest Labs that allows in-context image editing with character and style consistency, supporting iterative, instruction-guided edits. <metadata> gpu: A100 | collections: ["HF_Transformers"] </metadata>

gemma-3n-e4b-it
Public template. An 8B variant of the lightweight Gemma 3n series that operates with a 4B-parameter memory footprint, enabling full multimodal inference (text, image, audio, video) on resource-constrained hardware. <metadata> gpu: A100 | collections: ["HF_Transformers"] </metadata>

qwen3-embedding-0.6b
Public template. A 600M-parameter, 100-language embedding model that turns inputs of up to 32k tokens into instruction-aware vectors. <metadata> gpu: A10 | collections: ["HF_Transformers"] </metadata>

devstral-small
Public template. An agentic LLM for software engineering tasks that excels at using tools to explore codebases, edit multiple files, and power software engineering agents. <metadata> gpu: A100 | collections: ["HF_Transformers"] </metadata>

deepseek-r1-qwen3-8b
Public template. A distilled 8B-parameter reasoning model that leverages deep chain-of-thought from DeepSeek R1-0528, delivering state-of-the-art open-source performance. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

nanonets-ocr-s
Public template. Turns images or PDFs into structured Markdown, capturing tables, LaTeX, captions, and tags, for fast, powerful, human-readable OCR. <metadata> gpu: A10 | collections: ["HF_Transformers"] </metadata>

Open-NotebookLM
Public

yolo11m-detect
Public

kokoro
Public template. An 82M-parameter lightweight text-to-speech (TTS) model that delivers high-quality voice synthesis. <metadata> gpu: T4 | collections: ["SSE Events"] </metadata>

qwen3-14b
Public template. A 14B model with a hybrid approach to problem-solving via two distinct modes: "thinking mode," which enables step-by-step reasoning, and "non-thinking mode," designed for rapid, general-purpose responses. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>

qwen2.5-omni-7b
Public template. An advanced end-to-end multimodal model that processes text, image, audio, and video inputs, generating real-time text and natural speech responses. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

qwen3-8b
Public template. A language model that supports seamless switching between "thinking" mode, for advanced math, coding, and logical inference, and "non-thinking" mode, for fast, natural conversation. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

MCP-Google-Map-Agent
Public

phi-4-multimodal-instruct
Public template. A state-of-the-art multimodal foundation model developed by Microsoft Research that seamlessly fuses robust language understanding with advanced visual and audio analysis. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

An 8B model that excels at producing high-quality, detailed images up to 1 megapixel in resolution. <metadata> gpu: A100 | collections: ["Diffusers"] </metadata>

phi-4-GGUF
Public template. A 14B model optimized in GGUF format for efficient inference, designed to excel at complex reasoning tasks. <metadata> gpu: A100 | collections: ["llama.cpp","GGUF"] </metadata>

tinyllama-1-1b-chat-v1-0
Public template. A chat model fine-tuned from TinyLlama, a compact 1.1B Llama model pretrained on 3 trillion tokens. <metadata> gpu: T4 | collections: ["vLLM"] </metadata>

llama-2-13b-chat-hf
Public template. A 13B model fine-tuned with reinforcement learning from human feedback, part of Meta's Llama 2 family for dialogue tasks. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
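Each template entry above carries a `<metadata>` annotation recording its GPU tier and the collections it belongs to. A minimal Python sketch of a parser for this annotation format (the `parse_metadata` helper and its field handling are illustrative assumptions based on the entries shown, not an official API):

```python
import json
import re


def parse_metadata(entry: str) -> dict:
    """Extract the <metadata> ... </metadata> annotation from a template
    entry and return its fields as a dict (illustrative helper)."""
    match = re.search(r"<metadata>(.*?)</metadata>", entry, re.DOTALL)
    if not match:
        return {}
    fields = {}
    # Fields are pipe-separated "key: value" pairs.
    for part in match.group(1).split("|"):
        key, _, value = part.partition(":")
        key, value = key.strip(), value.strip()
        # The collections field is a JSON-style list of strings.
        fields[key] = json.loads(value) if key == "collections" else value
    return fields


# Example on an entry from the list above:
entry = 'Public template. A 14B model... <metadata> gpu: A100 | collections: ["llama.cpp","GGUF"] </metadata>'
print(parse_metadata(entry))  # {'gpu': 'A100', 'collections': ['llama.cpp', 'GGUF']}
```

Entries with no annotation (plain "Public" repositories such as qwen-image) simply yield an empty dict.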