catswe/cute-kernels

 
 

Discord Server

Join the Discord server if you are interested in LLM architecture or distributed training/inference research.

Efficient GPU kernels written in both CUDA and Triton

Cute Inductor

CuteInductor makes it easier to inject the kernels in this repository into any PyTorch module.

About

A bunch of kernels that might make stuff slower 😉

Languages

  • Python 77.0%
  • Cuda 16.6%
  • C++ 5.6%
  • Other 0.8%