Open
Labels
Nvidia GPU, enhancement, help wanted, roadmap
Description
Initial Mamba support (CPU-only) was recently introduced in #5328 by @compilade.
To run these models efficiently on the GPU, we seem to be lacking kernel implementations for the following two ops:
GGML_OP_SSM_CONV
GGML_OP_SSM_SCAN
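For anyone picking this up, here is a rough CPU reference in C++ of what a kernel for each op would need to reproduce. It is a sketch based on the Mamba formulation (a depthwise causal convolution, then the selective-scan recurrence), not a copy of the ggml code. The function names `ssm_conv_ref` and `ssm_scan_ref`, the nested-vector layouts, and the assumption that softplus is applied to `dt` inside the scan are all illustrative and should be checked against the implementation from #5328.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<float>>;

// Hypothetical reference for the SSM_CONV role: depthwise causal 1D convolution.
// x: [n_tokens][d_inner] inputs, w: [d_inner][d_conv] per-channel filter taps.
// Positions before the start of the sequence are treated as zeros.
Mat ssm_conv_ref(const Mat & x, const Mat & w) {
    const size_t n_tokens = x.size();
    const size_t d_inner  = w.size();
    const size_t d_conv   = d_inner ? w[0].size() : 0;
    Mat y(n_tokens, std::vector<float>(d_inner, 0.0f));
    for (size_t t = 0; t < n_tokens; ++t) {
        for (size_t c = 0; c < d_inner; ++c) {
            float acc = 0.0f;
            for (size_t k = 0; k < d_conv; ++k) {
                // Tap k looks back (d_conv - 1 - k) tokens; the last tap is the current token.
                const size_t back = d_conv - 1 - k;
                if (t >= back) {
                    acc += w[c][k] * x[t - back][c];
                }
            }
            y[t][c] = acc;
        }
    }
    return y;
}

// Hypothetical reference for the SSM_SCAN role: selective scan over one sequence.
// h:     [d_inner][d_state] recurrent state, updated in place
// x, dt: [n_tokens][d_inner]
// A:     [d_inner][d_state]
// B, C:  [n_tokens][d_state]
// Assumption: dt arrives before softplus, so softplus is applied here.
Mat ssm_scan_ref(Mat & h, const Mat & x, const Mat & dt,
                 const Mat & A, const Mat & B, const Mat & C) {
    const size_t n_tokens = x.size();
    const size_t d_inner  = A.size();
    const size_t d_state  = d_inner ? A[0].size() : 0;
    Mat y(n_tokens, std::vector<float>(d_inner, 0.0f));
    for (size_t t = 0; t < n_tokens; ++t) {
        for (size_t c = 0; c < d_inner; ++c) {
            // Numerically safe softplus: for large inputs softplus(v) is approximately v.
            const float v     = dt[t][c];
            const float dt_sp = v <= 20.0f ? std::log1p(std::exp(v)) : v;
            float sum = 0.0f;
            for (size_t n = 0; n < d_state; ++n) {
                // h_t = exp(dt * A) * h_{t-1} + dt * B_t * x_t
                h[c][n] = h[c][n] * std::exp(dt_sp * A[c][n]) + B[t][n] * (x[t][c] * dt_sp);
                // y_t = sum over the state dimension of C_t * h_t
                sum += h[c][n] * C[t][n];
            }
            y[t][c] = sum;
        }
    }
    return y;
}
```

The convolution is independent per token and channel, so it maps naturally to one thread per output element. The scan is sequential in `t` but independent across channels and state entries, which is where a GPU kernel would likely find its parallelism.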
I'm creating this issue to track the work and give this feature more visibility. Help with implementing the missing kernels for CUDA and Metal (and potentially other backends) is welcome. We can also discuss whether anything else is required to better support this architecture in llama.cpp.