easy-llama wishlist #10

@oobabooga

Description

I'm interested in potentially replacing llama-cpp-python with easy-llama in my project, and have some questions about feature parity:

  1. Are all parameters in the params dict below available?

https://github.com/oobabooga/text-generation-webui/blob/096272f49e55357a364ed9016357b97829dae0fd/modules/llamacpp_model.py#L88
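
For reference, a rough illustration of the kind of parameters I mean (illustrative only, not the exact dict at the linked commit; these are common llama-cpp-python `Llama()` keyword arguments):

```python
# Illustrative only -- not the exact dict at the commit linked above.
# These are common llama-cpp-python Llama() keyword arguments; the
# question is whether easy-llama accepts equivalents for each of them.
params = {
    "model_path": "model.gguf",   # placeholder path
    "n_ctx": 4096,
    "n_batch": 512,
    "n_threads": 8,
    "n_gpu_layers": 0,
    "use_mmap": True,
    "use_mlock": False,
    "rope_freq_base": 10000.0,
    "rope_freq_scale": 1.0,
    # ...plus the sampling options used at generation time
}
```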

  2. Is it possible to get the logits after a certain input? As done here:

https://github.com/oobabooga/text-generation-webui/blob/096272f49e55357a364ed9016357b97829dae0fd/modules/llamacpp_model.py#L134
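
Roughly, with llama-cpp-python that looks like this (a minimal sketch; assumes `model` is an already-loaded `llama_cpp.Llama`, and the prompt is a placeholder):

```python
# Minimal sketch of "get the logits after a given input" in
# llama-cpp-python (assumes `model` is a loaded llama_cpp.Llama).
tokens = model.tokenize(b"The capital of France is")
model.reset()
model.eval(tokens)

# Vocabulary-sized logits for the next token after the input
next_token_logits = model.eval_logits[-1]
```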

  3. Similar to 2., but more nuanced: is there a way to get the logits for every token position in an input at once? In llama-cpp-python, this is done by passing logits_all=True when loading the model, which reduces performance but makes all the logits available as a matrix through model.eval_logits. I used this feature a while ago to measure the perplexity of llama.cpp quants with the code here:

https://github.com/oobabooga/text-generation-webui/blob/096272f49e55357a364ed9016357b97829dae0fd/modules/llamacpp_hf.py#L133
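
A minimal sketch of that pattern on the llama-cpp-python side (placeholder model path and prompt; not easy-llama code):

```python
import math
from llama_cpp import Llama

# logits_all=True keeps the logits for every evaluated position
model = Llama(model_path="model.gguf", logits_all=True)  # placeholder path

tokens = model.tokenize(b"The quick brown fox jumps over the lazy dog.")
model.reset()
model.eval(tokens)

# One row of vocab-sized logits per evaluated token position
logits = model.eval_logits

# Perplexity: average negative log-likelihood of each token given its context
nll = 0.0
for i in range(1, len(tokens)):
    row = logits[i - 1]  # predictions after consuming tokens[:i]
    m = max(row)         # log-sum-exp, numerically stable
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    nll += log_z - row[tokens[i]]

print(f"perplexity: {math.exp(nll / (len(tokens) - 1)):.3f}")
```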

  4. I have a llamacpp_HF wrapper that connects llama.cpp to the HF text generation functions; at its core, all it does is update model.n_tokens to do prefix matching, and evaluate new tokens by calling model.eval with a list containing only the new tokens. Can that be done with easy-llama? See:

https://github.com/oobabooga/text-generation-webui/blob/096272f49e55357a364ed9016357b97829dae0fd/modules/llamacpp_hf.py#L118
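
In llama-cpp-python terms, the core of the wrapper is roughly this (a simplified sketch; the question is whether easy-llama exposes equivalents of the two hooks, `n_tokens` and `eval`):

```python
# Simplified sketch of the prefix-matching trick described above,
# written against llama-cpp-python names (model.n_tokens, model.eval).
def eval_with_prefix_reuse(model, past_tokens, new_tokens):
    # Length of the longest shared prefix between cached and new sequences
    prefix_len = 0
    for a, b in zip(past_tokens, new_tokens):
        if a != b:
            break
        prefix_len += 1

    # Roll the cache back to the shared prefix...
    model.n_tokens = prefix_len

    # ...and evaluate only the tokens that are actually new
    remaining = new_tokens[prefix_len:]
    if remaining:
        model.eval(remaining)
```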

  5. Is speculative decoding implemented? There is a PR at https://github.com/oobabooga/text-generation-webui/pull/6669/files to add it, and having it in easy-llama would be great, especially if it could be done simply by passing new kwargs to the model loading and/or generation functions (a hypothetical sketch follows). I believe doing that for my llamacpp_HF wrapper would be very hard, so that's not something I have hopes for.
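
Purely hypothetical, but something like this is what I have in mind (none of these names are confirmed easy-llama API):

```python
import easy_llama as ez  # assumed import name; the kwargs below are invented

# Hypothetical sketch: speculative decoding enabled via loader kwargs
model = ez.Llama(
    "model.gguf",              # main model (placeholder path)
    draft_model="draft.gguf",  # hypothetical: smaller draft model
    n_draft=8,                 # hypothetical: draft tokens per step
)
```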

If you are interested, a PR changing llama-cpp-python to easy-llama in my repository would be highly welcome once wheels are available. It would also be a way to test the library. But I can also try to do the change myself.
