Open
Labels: good first issue, help wanted, roadmap
Description
Project: ggml-org : tutorials
List:
- tutorial : compute embeddings using llama.cpp
- tutorial : parallel inference using Hugging Face dedicated endpoints
- tutorial : KV cache reuse with llama-server
- tutorial : measuring time to first token (TTFT) and time between tokens (TBT)
TODO:
- Is there a way to cache multiple prompt prefixes? #13488
- How to use function calls? #13134
- how to measure time to first token (TTFT) and time between tokens (TBT) #13251
- Apple A-chipsets; how to estimate a suitable model size ? #12742
- How to get started with webui development (ref: tutorials : list for llama.cpp #13523 (comment))
- etc.
More candidate topics can be found by searching for "How to" in the Discussions: https://github.com/ggml-org/llama.cpp/discussions?discussions_q=is%3Aopen+How+to
Contributions for writing tutorials are welcome!
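As a starting point for the TTFT/TBT tutorial item, here is a minimal sketch of how the two metrics could be computed from any iterator that yields tokens as they arrive. The names `measure_stream` and `mean_tbt` are hypothetical helpers, not part of llama.cpp, and the sketch assumes the request is issued when the iterator is first advanced, so the start timestamp precedes the request.

```python
import time
from typing import Callable, Iterable, List, Tuple


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, List[float]]:
    """Consume a token stream and return (ttft, gaps).

    ttft: seconds from the start timestamp to the first token.
    gaps: seconds between each pair of consecutive tokens.
    """
    start = clock()  # taken before the first token is requested
    arrivals: List[float] = []
    for _ in tokens:
        arrivals.append(clock())  # timestamp each token as it arrives
    if not arrivals:
        raise ValueError("stream produced no tokens")
    ttft = arrivals[0] - start
    gaps = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]
    return ttft, gaps


def mean_tbt(gaps: List[float]) -> float:
    """Average time between tokens; 0.0 when only one token was produced."""
    return sum(gaps) / len(gaps) if gaps else 0.0
```

In a real measurement, `tokens` would be a generator over a streaming response (for example, one that yields each chunk received from a server); that wiring is left out here because it depends on the client being used.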
Project status: In Progress