Fix slow gguf tests #2846

For-rest2005 · 2025-03-26T15:17:47Z

This pull request adds a new API model, "builtin_gguf". It is derived from abetlen/llama-cpp-python#1983
However, it does not produce correct results out of the box because of errors in the llama-cpp-python API. I have posted an issue on the llama-cpp-python project in which I describe my solution for the specific needs of "builtin_gguf" and some optimizations for these APIs. You can modify the llama-cpp-python code manually according to that issue. My changes only cover what is needed when running within "builtin_gguf" and are not a thorough fix for llama-cpp-python, so I have not opened a pull request there; a complete fix is left for others.
By the way, this API model is still incomplete. I hope you can help complete it.

CLAassistant · 2025-03-26T15:17:57Z

All committers have signed the CLA.

For-rest2005 and others added 2 commits March 26, 2025 17:15

add builtin_gguf API model

ef98744

Merge branch 'EleutherAI:main' into main

160e680

For-rest2005 requested review from baberabb and StellaAthena as code owners March 26, 2025 15:17

Update builtin_gguf.py

c3ee3ba

For-rest2005 mentioned this pull request Apr 10, 2025

Slow when logits_all=True, inconsistent logprobs and solutions abetlen/llama-cpp-python#1983

Open
