Config file not being read #1139
Unanswered
gurbindersingh asked this question in Q&A
Replies: 0 comments
I've set up LocalAI on an Ubuntu 22.04 machine with an Nvidia GPU following the easy setup guide, but it looks like there is no GPU offloading. I can see the following messages in the logs: `offloading 0/43 layers to the GPU` and `VRAM used: 0MB`.

It took a while to figure out that you need the `gpu_layers` parameter in the config file, since that isn't mentioned in the guide. I've now created the config file in the same directory as the model (the official Llama 2 model converted using Llama.cpp). This is what it looks like:

But it doesn't seem like these configs are being read, even after restarting (and even rebuilding) the container. I'm not sure whether this is a bug or whether I'm doing something wrong, since the docs don't mention which of the properties in the config are required and which are not.
Edit: I can run the model just fine with GPU offloading using Llama.cpp at 70 tokens/s. With LocalAI it's only running at 5-7 tokens/s.