
Support llama.cpp #121

@ParetoOptimalDev

Description


I can't get ollama to work with GPU acceleration, so I'm using llama.cpp, which has a Nix flake that worked perfectly (once I understood that "cuda" meant the CUDA build of llama.cpp and not the CUDA library) 😍

It looks like llama.cpp has a different API, so I can't just use `gptel-make-ollama`. Does that sound correct?

Then again, I see that llama-cpp-python offers an "OpenAI-like API". The downside is that I'd have to package llama-cpp-python for Nix.

Maybe I can use that with gptel somehow? Just looking for a bit of guidance, but I'll tinker around when I get time and try things. If I find anything useful, I'll report back here.
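For reference, llama.cpp's server example advertises an OpenAI-compatible HTTP API, so maybe gptel could treat it like any OpenAI-style backend. A minimal sketch of what I have in mind, assuming a `gptel-make-openai`-style constructor and a server listening on localhost:8080 (both assumptions on my part, not tested):

```elisp
;; Sketch only: names and values below are assumptions, not verified.
;; If gptel can register a generic OpenAI-compatible backend, pointing
;; it at a local llama.cpp server might look like this:
(gptel-make-openai "llama-cpp"   ;backend name, anything works
  :stream t                      ;the server supports streaming responses
  :protocol "http"               ;local server, no TLS
  :host "localhost:8080"         ;assumed llama.cpp server address/port
  :models '(test))               ;placeholder model name
```

The model name is probably irrelevant, since the llama.cpp server loads a single model at startup, but I haven't verified that either.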


Labels: enhancement (New feature or request), help wanted (Extra attention is needed)
