    Repositories list

    • LaCache

      Public
      [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models.
      Python
      Updated Jul 22, 2025
    • DiffCR

      Public
      Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers
      Python
      Updated May 19, 2025
    • [CVPR 2025] "Early-Bird Diffusion: Investigating and Leveraging Timestep-Aware Early-Bird Tickets in Diffusion Models for Efficient Training" by Lexington Whalen, Zhenbang Du, Haoran You, Chaojian Li, Sixu Li, and Yingyan (Celine) Lin.
      Python
      Updated May 5, 2025
    • LongMamba

      Public
      A training-free method for extending the context length of SSMs (State Space Models) and hybrid architectures.
      Python
      Updated Apr 26, 2025
    • An open-source PyTorch library for developing energy-efficient multiplication-less models and applications.
      Python
      Updated Feb 3, 2025
    • [ECCV 2024 Oral] "Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields" by Yonggan Fu, Huaizhi Qu, Zhifan Ye, Chaojian Li, Kevin Zhao, and Yingyan (Celine) Lin.
      Python
      Updated Dec 14, 2024
    • AmoebaLLM

      Public
      [NeurIPS 2024] "AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment" by Yonggan Fu, Zhongzhi Yu, Junwei Li, Jiayi Qian, Yongan Zhang, Xiangchi Yuan, Dachuan Shi, Roman Yakunin, and Yingyan (Celine) Lin.
      Python
      Updated Dec 13, 2024
    • ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
      Python
      Updated Oct 15, 2024
    • Python
      Updated Oct 8, 2024
    • LLM4HWDesign Starting Toolkit
      Python
      Updated Oct 4, 2024
    • ACT

      Public
      [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
      Python
      Updated Jun 30, 2024
    • Edge-LLM

      Public
      [DAC 2024] EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting
      Python
      Updated Jun 30, 2024
    • [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
      Python
      Updated Jun 12, 2024
    • [CVPR 2023] Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference
      Python
      Updated Mar 14, 2024
    • NeRFool

      Public
      [ICML 2023] "NeRFool: Uncovering the Vulnerability of Generalizable Neural Radiance Fields against Adversarial Perturbations" by Yonggan Fu, Ye Yuan, Souvik Kundu, Shang Wu, Shunyao Zhang, Yingyan (Celine) Lin
      Python
      Updated Mar 10, 2024
    • CPT

      Public
      [ICLR 2021 Spotlight] "CPT: Efficient Deep Neural Network Training via Cyclic Precision" by Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, and Yingyan (Celine) Lin.
      Python
      Updated Mar 2, 2024
    • [NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
      Python
      Updated Dec 6, 2023
    • C
      Updated Oct 19, 2023
    • BNS-GCN

      Public
      [MLSys 2022] "BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node Sampling" by Cheng Wan, Youjie Li, Ang Li, Nam Sung Kim, Yingyan Lin
      Python
      Updated Oct 6, 2023
    • S3-Router

      Public
      [NeurIPS 2022] "Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing" by Yonggan Fu, Yang Zhang, Kaizhi Qian, Zhifan Ye, Zhongzhi Yu, Cheng-I Lai, Yingyan Lin
      Python
      Updated Sep 19, 2023
    • ViTCoD

      Public
      [HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
      Python
      Updated Jun 27, 2023
    • Hint-Aug

      Public
      Python
      Updated Jun 25, 2023
    • [ICLR 2021] HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark
      Python
      Updated Apr 18, 2023
    • HALO

      Public
      The official code for [ECCV2020] "HALO: Hardware-aware Learning to Optimize"
      Python
      Updated Mar 22, 2023
    • PipeGCN

      Public
      [ICLR 2022] "PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication" by Cheng Wan, Youjie Li, Cameron R. Wolfe, Anastasios Kyrillidis, Nam Sung Kim, Yingyan Lin
      Python
      Updated Mar 15, 2023
    • ViTALiTy

      Public
      ViTALiTy (HPCA'23) Code Repository
      Python
      Updated Mar 13, 2023
    • Spline-EB

      Public
      [TMLR] Max-Affine Spline Insights Into Deep Network Pruning
      Python
      Updated Nov 12, 2022
    • Updated Oct 27, 2022
    • [ICASSP'20] DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures
      Python
      Updated Oct 1, 2022
    • NASA

      Public
      [ICCAD 2022] NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks
      Python
      Updated Sep 22, 2022