Code for Fast Ternary Large Language Model Inference with Addition-Based Sparse GEMM on Edge Devices

fpgasystems/ternaryLLM

ternaryLLM

Code for Fast Ternary Large Language Model Inference with Addition-Based Sparse GEMM on Edge Devices
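
As a rough illustration of what "addition-based sparse GEMM" can mean for ternary weights, here is a minimal sketch. It is not the repository's kernel. It assumes weights restricted to {-1, 0, +1}, stored as per-column index lists; the names `split_ternary` and `ternary_gemm` are hypothetical. With such weights, each output element reduces to a sum of selected inputs minus a sum of other selected inputs, so no multiplications are needed and zero weights are skipped entirely.

```python
def split_ternary(W):
    """For each output column j, collect the input rows where W is +1 and where it is -1.

    W is a list of rows with entries in {-1, 0, +1} (assumed layout, for illustration only).
    """
    n_in, n_out = len(W), len(W[0])
    plus = [[i for i in range(n_in) if W[i][j] == 1] for j in range(n_out)]
    minus = [[i for i in range(n_in) if W[i][j] == -1] for j in range(n_out)]
    return plus, minus


def ternary_gemm(X, plus, minus):
    """Compute X @ W using only additions and subtractions of entries of X."""
    out = []
    for x in X:                      # one activation row at a time
        row = []
        for p, m in zip(plus, minus):
            acc = 0.0
            for i in p:              # weights equal to +1: add the input
                acc += x[i]
            for i in m:              # weights equal to -1: subtract the input
                acc -= x[i]
            row.append(acc)          # zero weights are never visited
        out.append(row)
    return out


def dense_gemm(X, W):
    """Reference multiply-accumulate GEMM, used only to check the sketch."""
    return [[sum(x[i] * W[i][j] for i in range(len(W))) for j in range(len(W[0]))]
            for x in X]


W = [[1, 0], [-1, 1], [0, -1]]       # 3 inputs, 2 outputs
X = [[2.0, 3.0, 5.0]]
plus, minus = split_ternary(W)
Y = ternary_gemm(X, plus, minus)     # [[-1.0, -2.0]]
```

The index lists are built once per weight matrix and reused for every input, which is where a sparse, addition-only formulation would be expected to save work over a dense multiply-accumulate loop.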

Initial contribution:

  • CPU code: Mila and Shien
  • GPU code: Guanshujie
