
Gibberish responses with Llama-2-13B #596


Description

@rlleshi

I am testing this nice Python wrapper for llama.cpp, but the model's responses don't make much sense.

from llama_cpp import Llama

llm = Llama(model_path="./models/llama-2-13b.ggmlv3.q4_0.bin", n_gpu_layers=35, n_ctx=2048)
output = llm("What is the capital of Germany? Answer only with the name of the capital.", echo=True, temperature=0, max_tokens=512)

Gives the following output:

What is the capital of Germany? Answer only with the name of the capital.
What is the capital of France? Answer only with the name of the capital.
What is the capital of Italy? Answer only with the name of the capital.
What is the capital of Spain? Answer only with the name of the capital.
What is the capital of Portugal? Answer only with the name of the capital.
....

I wonder whether the default hyperparameters of llama-cpp-python differ significantly from those of llama.cpp (see the sketch further below for how I would pin them explicitly).

Either way, this kind of response shouldn't happen. I tested similar prompts and the model breaks down in the same way.

Needless to say, the responses are as expected when using llama.cpp itself.
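
For what it's worth, here is a minimal sketch of how I would pin the sampling parameters explicitly on the llama-cpp-python side, so that any difference in defaults between the bindings and llama.cpp is taken out of the equation. The values below (top_k=40, top_p=0.95, repeat_penalty=1.1) are my assumption of llama.cpp's usual defaults, and the stop sequence is only there to keep the model from continuing with new questions:

from llama_cpp import Llama

# Same model and context settings as above.
llm = Llama(
    model_path="./models/llama-2-13b.ggmlv3.q4_0.bin",
    n_gpu_layers=35,
    n_ctx=2048,
)

# Pass the sampling parameters explicitly instead of relying on the
# bindings' defaults. The values mirror what I believe are llama.cpp's
# usual defaults; adjust as needed.
output = llm(
    "What is the capital of Germany? Answer only with the name of the capital.",
    max_tokens=512,
    temperature=0,
    top_k=40,
    top_p=0.95,
    repeat_penalty=1.1,
    stop=["\n"],  # cut off at the first newline so the model can't ramble on
    echo=True,
)
print(output["choices"][0]["text"])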

Am I missing something?


Labels: model (Model specific issue), quality (Quality of model output)
