Excessive Size #74

Hi there, I saw your Reddit post and it piqued my interest, so I tried running it against my Ollama server. However, I noticed a huge increase in memory usage compared to running the LLM natively via `ollama run <model>`. Is this intended behavior? I have attached a screenshot showing the difference in memory usage and CPU/GPU utilization between the two. I also noticed that it didn't use my GPU, even though Ollama is served with the GPU passed through to it (as you can see in the native `ollama run` portion, where it is able to use my GPU).

![image](https://github.com/user-attachments/assets/ce8e45db-47a2-414b-a4c2-34843e10dad4)

I also hit this rate-limiting message when attempting to run a nonsense research topic. Is that normal?
