
Support llama.cpp #121

@ParetoOptimalDev

Description


I can't get ollama to work with GPU acceleration, so I'm using llama.cpp, which has a Nix flake that worked perfectly (once I understood that "cuda" meant the CUDA build of llama.cpp and not the CUDA library) 😍

It looks like llama.cpp has a different API, so I can't just use `gptel-make-ollama`. Does that sound correct?

Then again, I see that llama-cpp-python offers an "OpenAI-like API". The downside is that I'd have to package llama-cpp-python for Nix.

Maybe I can use that with gptel somehow? Just looking for a bit of guidance, but I'll tinker around when I get time and try things. If I find anything useful, I'll report back here.
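For reference, llama.cpp's server example advertises an OpenAI-compatible HTTP API, so maybe gptel could treat it like any OpenAI-style backend. A minimal sketch of what I have in mind, assuming a `gptel-make-openai`-style constructor and a server listening on localhost:8080 (both assumptions on my part, not tested):

```elisp
;; Sketch only: names and values below are assumptions, not verified.
;; If gptel can register a generic OpenAI-compatible backend, pointing
;; it at a local llama.cpp server might look like this:
(gptel-make-openai "llama-cpp"   ;backend name, anything works
  :stream t                      ;the server supports streaming responses
  :protocol "http"               ;local server, no TLS
  :host "localhost:8080"         ;assumed llama.cpp server address/port
  :models '(test))               ;placeholder model name
```

The model name is probably irrelevant, since the llama.cpp server loads a single model at startup, but I haven't verified that either.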


Labels: enhancement (New feature or request), help wanted (Extra attention is needed)
