fpgasystems / ternaryLLM Public

Code for Fast Ternary Large Language Model Inference with Addition-Based Sparse GEMM on Edge Devices

Repository contents:

ternaryLLM_CPU
ternaryLLM_FPGA
ternaryLLM_GPU
.gitignore
LICENSE
README.md

ternaryLLM

Code for Fast Ternary Large Language Model Inference with Addition-Based Sparse GEMM on Edge Devices
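The title refers to addition-based sparse GEMM for ternary weights. As a rough illustration of the general idea only (this is not taken from the repository's CPU, GPU, or FPGA kernels, and the function names `ternary_to_sparse` and `ternary_gemm` are hypothetical), the sketch below assumes a weight matrix whose entries are restricted to -1, 0, and +1. Under that assumption, each output element can be computed by adding the activations at the +1 positions and subtracting those at the -1 positions, so no multiplications are needed and zero weights are skipped entirely.

```python
# Illustrative sketch only; NOT the ternaryLLM implementation.
# Assumption: weights are ternary, i.e. every entry is -1, 0, or +1.

def ternary_to_sparse(weights):
    """Convert a K x N ternary matrix into per-column index lists.

    Returns (plus, minus), where plus[c] holds the row indices with
    weight +1 in column c and minus[c] those with weight -1.
    """
    k = len(weights)
    n = len(weights[0]) if k else 0
    plus = [[] for _ in range(n)]
    minus = [[] for _ in range(n)]
    for row in range(k):
        for col in range(n):
            w = weights[row][col]
            if w == 1:
                plus[col].append(row)
            elif w == -1:
                minus[col].append(row)
            elif w != 0:
                raise ValueError("weights must be -1, 0, or +1")
    return plus, minus


def ternary_gemm(x, plus, minus):
    """Compute Y = X @ W using only additions and subtractions."""
    out = []
    for x_row in x:
        y_row = []
        for p_idx, m_idx in zip(plus, minus):
            acc = 0.0
            for i in p_idx:   # +1 weights: add the activation
                acc += x_row[i]
            for i in m_idx:   # -1 weights: subtract the activation
                acc -= x_row[i]
            y_row.append(acc)
        out.append(y_row)
    return out


# Small example: X is 2 x 3, W is 3 x 2.
x = [[1.0, 2.0, 4.0],
     [0.5, -1.0, 3.0]]
w = [[1, 0],
     [-1, 1],
     [0, -1]]
plus, minus = ternary_to_sparse(w)
y = ternary_gemm(x, plus, minus)
print(y)  # [[-1.0, -2.0], [1.5, -4.0]]
```

The index lists are built once per weight matrix and reused for every input, which is why the conversion is kept separate from the multiply-free accumulation loop. A real kernel would likely use a compact index format and vectorized accumulation; those details are outside this sketch.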

Initial contribution:

CPU code: Mila and Shien
GPU code: Guanshujie
