Awesome research works for on-device AI systems

A curated list of research works on efficient on-device AI systems, methods, and applications for mobile and edge devices.

Note: Some of these works target inference acceleration on cloud/server infrastructure, which offers far greater computational resources, but I include them here when they can plausibly be generalized to on-device inference use cases.

• Attention Operation Acceleration
• LLM Inference Acceleration on Mobile SoCs
• Hardware-aware Quantization (see the sketch after this list)
• Compiler-based ML Optimization
• Inference Acceleration using Heterogeneous Computing Processors (CPU, GPU, NPU, etc.)
• Adaptive Inference for Optimized Resource Utilization
• On-device Training and Model Adaptation
• Profilers
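As a quick illustration of the kind of technique the hardware-aware quantization entries cover, here is a minimal sketch of per-tensor symmetric int8 quantization, the baseline scheme many of these papers refine. The function names and the symmetric 127-level range are illustrative choices, not taken from any listed paper.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Per-tensor symmetric int8 quantization: x ~= scale * q."""
    scale = float(np.abs(x).max()) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero tensor: any scale works
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# Example: the round-trip error is bounded by half a quantization step.
x = np.random.randn(4, 8).astype(np.float32)
q, s = quantize_int8(x)
assert np.abs(dequantize_int8(q, s) - x).max() <= s / 2 + 1e-6
```

Hardware-aware variants go further than this sketch: they pick bit widths, scales, and layouts per layer or per channel based on what the target accelerator executes efficiently.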

By Conference (2025 onward)

MLSys 2025
ASPLOS 2025
  • Fast On-device LLM Inference with NPUs
  • Energy-aware Scheduling and Input Buffer Overflow Prevention for Energy-harvesting Systems
  • Generalizing Reuse Patterns for Efficient DNN on Microcontrollers
  • Nazar: Monitoring and Adapting ML Models on Mobile Devices
EuroSys 2025
  • Flex: Fast, Accurate DNN Inference on Low-Cost Edges Using Heterogeneous Accelerator Execution
  • T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge
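
The T-MAC entry above centers on table lookup for low-bit matrix multiplication. The sketch below illustrates the underlying idea in plain NumPy: precompute the dot products of each activation group with every possible low-bit weight pattern, then reduce the matmul to weight-indexed lookups and additions. This is a simplified illustration assuming 2-bit signed weights and groups of 4; it is not T-MAC's actual bit-plane decomposition or CPU kernel.

```python
import numpy as np

BITS, G = 2, 4            # assumed: 2-bit weights, groups of 4
LEVELS = 1 << BITS        # 4 weight levels per element
N_PAT = LEVELS ** G       # 256 possible weight patterns per group

def build_tables(x: np.ndarray) -> np.ndarray:
    """tables[i, p] = dot(group i of x, decoded weight pattern p)."""
    idx = np.arange(N_PAT)
    # decode every pattern once: weight k of pattern p lives in bits [BITS*k, BITS*(k+1))
    pats = np.stack([(idx >> (BITS * k)) & (LEVELS - 1) for k in range(G)], axis=1)
    pats = pats.astype(np.float32) - (LEVELS // 2)  # signed levels {-2,-1,0,1}
    groups = x.reshape(-1, G)                       # (n_groups, G)
    return groups @ pats.T                          # (n_groups, N_PAT)

def lut_dot(packed_w: np.ndarray, tables: np.ndarray) -> float:
    """Dot product via one table lookup per weight group instead of multiplies."""
    return float(tables[np.arange(len(packed_w)), packed_w].sum())

# Check against a naive dot product.
rng = np.random.default_rng(0)
x = rng.standard_normal(32).astype(np.float32)
levels = rng.integers(0, LEVELS, size=32)                     # raw 2-bit weight codes
packed = (levels.reshape(-1, G) << (BITS * np.arange(G))).sum(axis=1)
w = levels.astype(np.float32) - (LEVELS // 2)                 # decoded signed weights
assert np.isclose(lut_dot(packed, build_tables(x)), w @ x, atol=1e-4)
```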
SOSP 2025
MobiSys 2025
  • ARIA: Optimizing Vision Foundation Model Inference on Heterogeneous Mobile Processors for Augmented Reality
MobiCom 2025
Preprint 2025
  • HeteroLLM: Accelerating Large Language Model Inference on Mobile SoCs with Heterogeneous AI Accelerators