Using Node LLama CPP in a Child Process #481
-
Hello! I need to load a model in a child process. Since I have been having issues with that, I stripped my code down to literally just the import:

```js
import {
  getLlama,
  LlamaChatSession,
  Llama,
  LlamaModel,
  LlamaContext,
} from "node-llama-cpp";
```

This is the only thing in the `child.js` file, which is being spawned like this:

```js
this._child = spawn("node", [CHILD_PATH], {
  stdio: ["inherit", "pipe", "pipe", "ipc"],
  detached: false,
});
```

where `CHILD_PATH` comes from:

```js
const CHILD_PATH = resolve(__dirname, "./child.js");
```

The child process hangs when I do this. I set it up to print a line before and after the import statement (it turns out imports are handled before `console.log` statements, so I put the import inside a lazy load), and what I see is that it prints the first line before the import but not the second line after the import. Can anyone shed some light on this? Why does it not work?

My goal: the reason I want it in a child process is that unloading a model programmatically has been a pain. I don't know how things have changed over the last few months, but a few months ago I concluded through GitHub forum research that the best way to unload a model is to load it inside a worker and just kill the thread. However, I opted to move to a child process instead of a worker thread, because the worker thread would freeze the main thread upon loading a model, whereas a child process shouldn't, given that child processes are non-blocking. So while I would love to get this working in a child process, I would love even more to simply have a way to not block the main process while loading, and then be able to unload a model instantly and in a guaranteed way!
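For illustration, here is a minimal sketch of what the lazy-load probe in `child.js` might look like; the marker messages and the IPC "ready" payload are assumptions for the example rather than my exact code, and it assumes the child runs as an ES module so top-level await works:

```js
// child.js — minimal probe: print a marker, lazily import node-llama-cpp, print another marker.
// If the second marker never shows up, the hang happens inside the import itself.
console.log("child: before importing node-llama-cpp");

const { getLlama } = await import("node-llama-cpp");

console.log("child: after importing node-llama-cpp");

// Report back over the IPC channel set up by the parent's spawn() call ("ipc" in stdio).
if (process.send) {
    process.send({ ready: true, hasGetLlama: typeof getLlama === "function" });
}
```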
-
Please share the output of running this command, so I can get a sense of your environment:

```bash
npx --yes node-llama-cpp inspect gpu
```

You can `await model.dispose()` to unload a model on demand, or just let it be garbage collected to get it unloaded automatically.

The loading of the model is also asynchronous, but it may take about ~100ms of the main thread to read the model's metadata from the JS side if the metadata is extremely big, but this only happens once.

Why did you land on using another process/thread for that?

On node-llama-cpp v2 the loading and unloading of models used to be sync, so having another process for that back then made sense, but this is not the case anymore since v3. Perh…
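To make that concrete, here is a minimal sketch of loading and unloading on the main process, assuming the v3 API; the model path is a placeholder:

```js
// Minimal sketch: load a model asynchronously on the main process and unload it on demand.
import { getLlama } from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({ modelPath: "path/to/model.gguf" }); // async, does not block the main thread

// ...use the model (create a context, run a chat session, etc.)...

await model.dispose(); // unload on demand; if you skip this, the model is unloaded when garbage collected
```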
-
Thanks for taking the time to run all these tests! The Vulkan build seems to fail since you don't have the Vulkan SDK installed. Which model did you use in these tests?

So I think there are 2 different issues here.

To help me with the first one, please run:

```bash
npx --no node-llama-cpp inspect measure --gpu cuda <gguf path>
```

It'd also help if you can run this command with other models you have that work without issues, so I can use them as a baseline.

Regarding the child process, I think you can safely discard that approach.

Thank you again for helping me debug this and get these issues solved :)
-
Oh yes, the Vulkan SDK! I forgot that Vulkan would also have an SDK; I just use CUDA. The model I used is Llama 3.1 8B, fine-tuned by me with that Unsloth Colab. Would that muddy the results? Here are the results of running that command with 3 separate models:

Llama 3.1 8B Q4_0

Phi 2 Q8_0

NPCAgent (my fine-tuned model off of llama3.1:8b)
-
As I am somewhat strapped for time, I think if I just use the main process without worker threads or a child process, as you said, it should work fine. I have learned a lot in this conversation, so I feel that the need for figuring out how to prevent the lag does not outweigh the immediate fixes that I can set up in my app. The context size thing might be a quick fix I can do (sketched after this post), although I know that each user's computer will be different! So it seems I can only help you with the first one.

Also, in my experience, as stated in the other tests, setting the context size limit made it so the model loaded without allocation errors and could be disposed of correctly. So maybe you can see about making it so that if a model does not allocate correctly it will still be disposable? I am talking about these messages, as they are the only indicator to my small little mind of when it loads a model incorrectly:

I am also curious about why there is a dispose function on the llama instance. Do I just call it like this?

```js
async unloadModel() {
    if (this.model != null) {
        await this.model.dispose();
        await this.llama.dispose();
    }
    this.llama = null;
    this.model = null;
    this.context = null;
    this.session = null;
}
```

Please feel free to message me directly on GitHub; perhaps we can even chat on Discord in the future!
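P.S. For completeness, the context size limit mentioned above is set when creating the context; a minimal sketch, assuming the v3 API (the path and the value 4096 are placeholders, not the exact numbers from my app):

```js
import { getLlama, LlamaChatSession } from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({ modelPath: "path/to/model.gguf" }); // placeholder path

// Capping the context size keeps the context's VRAM allocation small enough
// that creation succeeds (4096 is an arbitrary example value).
const context = await model.createContext({ contextSize: 4096 });
const session = new LlamaChatSession({ contextSequence: context.getSequence() });
```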
-
I see now that the memory usage estimation for a context with CUDA on your machine is off by too much, which seems to be the reason why it fails to get created without limiting the context size. The log messages you saw about CUDA actually come from the native code of llama.cpp.

Regarding the freeze you described, I think this happened since it tried to allocate all the memory that the GPU has and didn't leave space for anything else, which caused the system to try to unload things from the GPU and resulted in the freeze you experienced. Did the phi2 model also cause a freeze on your machine? Does it still happen if you disable mmap?

HMU on email so we can exchange Discord handles if you like!
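(The "padding" mentioned here presumably refers to leaving some VRAM headroom. A minimal sketch of how that and the mmap toggle might look; the option names `vramPadding` and `useMmap`, and the values used, are assumptions to verify against the docs rather than confirmed API:)

```js
// Minimal sketch, assuming the v3 options exist as named here (verify against the docs):
// - vramPadding on getLlama() to leave some VRAM unused as headroom
// - useMmap on loadModel() to toggle memory mapping of the model file
import { getLlama } from "node-llama-cpp";

const llama = await getLlama({
    vramPadding: 1024 * 1024 * 1024 // leave ~1GB of VRAM free (example value)
});
const model = await llama.loadModel({
    modelPath: "path/to/model.gguf", // placeholder path
    useMmap: false // test whether the freeze still happens without mmap
});
```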
-
Awesome, thank you for the insights! The phi2 model does freeze sometimes, but I just tried to replicate it and it didn't. I am still trying to figure out when and how it freezes, as it's seemingly random.

The padding thing sounds useful, thank you for that!

As for mmap with phi, here are the results:

DISABLED

ENABLED

I am not experiencing any lag for either of these as of right now. Also, yes, I will email you now, thank you!