Description
I am testing this nice Python wrapper for llama.cpp, but the model's responses don't make much sense.
llm = Llama(model_path="./models/llama-2-13b.ggmlv3.q4_0.bin", n_gpu_layers=35, n_ctx=2048)
output = llm("What is the capital of Germany? Answer only with the name of the capital.", echo=True, temperature=0, max_tokens=512)
This gives the following output:
What is the capital of Germany? Answer only with the name of the capital.
What is the capital of France? Answer only with the name of the capital.
What is the capital of Italy? Answer only with the name of the capital.
What is the capital of Spain? Answer only with the name of the capital.
What is the capital of Portugal? Answer only with the name of the capital.
....
I wonder whether the default sampling hyperparameters of llama-cpp-python differ significantly from those of llama.cpp?
Either way, this kind of response shouldn't happen. I tested similar prompts and the model consistently breaks down as above.
Needless to say, the responses are as expected when using llama.cpp itself.
Am I missing something?
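For reference, one way I tried to rule out a defaults mismatch is to pass llama.cpp's sampling parameters explicitly instead of relying on the wrapper's defaults. A minimal sketch below; the default values are copied from my reading of llama.cpp's `common.h` at the time of writing and may have changed, so treat them as assumptions and check your own checkout:

```python
# Sampling defaults as used by llama.cpp's ./main example binary.
# NOTE: these values are assumptions based on one version of llama.cpp's
# common.h -- verify them against the version you built.
LLAMA_CPP_SAMPLING_DEFAULTS = {
    "temperature": 0.8,
    "top_k": 40,
    "top_p": 0.95,
    "repeat_penalty": 1.1,
}

# Passing them explicitly to llama-cpp-python's Llama.__call__ removes the
# wrapper's own defaults from the equation, e.g.:
#
#   output = llm(
#       "What is the capital of Germany? "
#       "Answer only with the name of the capital.",
#       max_tokens=512,
#       stop=["\n"],  # stop at the first newline so greedy decoding
#                     # cannot keep generating follow-up questions
#       **LLAMA_CPP_SAMPLING_DEFAULTS,
#   )
print(LLAMA_CPP_SAMPLING_DEFAULTS["repeat_penalty"])
```

The `stop=["\n"]` argument is worth trying regardless, since with `temperature=0` and no stop sequence the model is free to continue greedily past the answer, which would explain the runaway list of capital questions.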