Long time until generation starts when using big context

With a short prompt like "Hello, who are you?", I get about 200 ms/token and generation starts almost instantly.
On the other hand, when I paste in a small text (e.g. search results from the DuckDuckGo API), I have to wait about 1 minute before generation starts, and then it generates quite slowly. Is this normal behaviour?

My CPU is a Ryzen 7 6800H with 32 GB of DDR5 RAM. I'm running Vicuna 7B.
I pass the search-result context in through the Python bindings.
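
To put numbers on the two phases described above (the wait before the first token and the speed once tokens start arriving), something like the following could be used. This is only a rough sketch: `time_stream` and `fake_stream` are hypothetical helper names, not part of any bindings, and it assumes the bindings can hand back tokens one at a time as an iterable.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, Optional


def time_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, Optional[float]]:
    """Consume a token stream and report the delay before the first token
    and the mean gap between the tokens that follow it."""
    start = clock()
    first = None
    last = start
    count = 0
    for _ in tokens:
        now = clock()
        if first is None:
            first = now  # arrival time of the first token
        last = now
        count += 1
    if count == 0:
        return {"tokens": 0, "first_token_s": None, "per_token_s": None}
    # The mean gap needs at least two tokens to be defined.
    per_token = (last - first) / (count - 1) if count > 1 else None
    return {
        "tokens": count,
        "first_token_s": first - start,
        "per_token_s": per_token,
    }


def fake_stream(delay_s: float, gap_s: float, n: int) -> Iterator[str]:
    """Hypothetical stand-in for a real model: waits, then yields n tokens."""
    time.sleep(delay_s)
    for i in range(n):
        if i > 0:
            time.sleep(gap_s)
        yield f"tok{i}"


if __name__ == "__main__":
    # Replace fake_stream(...) with whatever token iterator the bindings expose.
    print(time_stream(fake_stream(delay_s=0.05, gap_s=0.01, n=5)))
```

Running this once with the short prompt and once with the pasted search results would show whether the extra minute falls entirely before the first token or whether the per-token time also changes.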

Long time until generation starts when using big context #865
