Announcing the Edge LLM Leaderboard – Now Live with Support from Hugging Face! #10865
Arnav0400 started this conversation in Show and tell
We’re thrilled to introduce the Edge LLM Leaderboard – a platform for benchmarking compressed LLMs on real edge hardware, starting with the Raspberry Pi 5 (8GB), powered by the Arm Cortex-A76 CPU and running models optimized with llama.cpp.
Link - https://huggingface.co/spaces/nyunai/edge-llm-leaderboard
🔑 Key Highlights
🔹 Real-World Performance Metrics:
We focus on critical metrics that matter for edge deployments:
• Prefill Latency (Time to First Token)
• Decode Latency (Generation Speed)
• Model Size (Efficiency for limited storage)
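To illustrate how the two latency metrics differ, here is a minimal timing sketch. The `generate` callable and its signature are hypothetical stand-ins for any local inference API (e.g. a llama.cpp binding); the leaderboard's own harness may measure these differently.

```python
import time

def benchmark(generate, prompt, n_new_tokens):
    # Prefill latency (time to first token): the model must process
    # the entire prompt before it can emit anything.
    start = time.perf_counter()
    text = generate(prompt, max_tokens=1)
    prefill_s = time.perf_counter() - start

    # Decode latency: average per-token time once the prompt is cached.
    start = time.perf_counter()
    generate(prompt + text, max_tokens=n_new_tokens - 1)
    decode_s = (time.perf_counter() - start) / (n_new_tokens - 1)

    return prefill_s, decode_s
```

On edge CPUs, prefill is typically compute-bound (it scales with prompt length) while decode is memory-bandwidth-bound, which is why the two are reported separately.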
🔹 130+ Models at Launch:
We’ve benchmarked sub-8B models with ARM-optimized quantizations like:
• Q8_0
• Q4_K_M
• Q4_0_4_4 (Arm NEON optimized)
This provides a comprehensive, real-world comparison of throughput, latency, and memory utilization on accessible, low-cost devices.
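For intuition on the Model Size metric, a back-of-the-envelope estimate of a quantized model's file size from parameter count and effective bits per weight. The bits-per-weight figures below are assumed round numbers, not exact values, and real GGUF files add some metadata overhead.

```python
def approx_size_gb(n_params, bits_per_weight):
    # params × bits / 8 gives bytes; ignores tokenizer/metadata overhead
    return n_params * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight for common llama.cpp quants
# (rough illustrative figures, not exact):
BPW = {"Q8_0": 8.5, "Q4_K_M": 4.8, "Q4_0": 4.5}

for name, bpw in BPW.items():
    print(f"8B model at {name}: ~{approx_size_gb(8e9, bpw):.1f} GB")
```

This makes it clear why 4-bit quantization is attractive on an 8GB device: it roughly halves the footprint versus 8-bit, leaving headroom for the KV cache and the OS.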
🔮 What’s Next?
📈 Expanded Backend Support: Adding frameworks with ARM compatibility.
🖥️ Additional Edge Hardware: Exploring underutilized devices for LLM deployment.
📩 Share your ideas or model requests at: edge-llm-evaluation@nyunai.com