Is it already possible, or would it be reasonably feasible, to use the compute capabilities of our GPUs to quantize models with llama.cpp?
I'm asking in particular because of the additional computation introduced by the new imatrix feature when making guided quants, especially the IQ ones (which are out of my i7-6700K's league for quantizing 70B models in a reasonable time).
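For context, here is the workflow I have in mind, as a minimal sketch (model and calibration file paths are placeholders). As far as I understand, the imatrix tool already accepts `-ngl` to offload layers to the GPU during the calibration passes, while the quantize step itself still runs on the CPU:

```sh
# Build with CUDA support (assumes a recent llama.cpp checkout;
# substitute the Metal/ROCm flags for other backends).
make LLAMA_CUBLAS=1

# Compute the importance matrix. -ngl 99 offloads all layers to the
# GPU, which is where the expensive inference passes over the
# calibration data happen. Paths here are placeholders.
./imatrix -m models/llama-70b-f16.gguf -f calibration.txt \
          -o llama-70b.imatrix -ngl 99

# The quantization step itself (CPU-bound) then consumes the imatrix
# to produce a guided IQ2_XS quant.
./quantize --imatrix llama-70b.imatrix \
           models/llama-70b-f16.gguf models/llama-70b-IQ2_XS.gguf IQ2_XS
```

So the question is really whether the quantization pass in the last step could also make use of the GPU, or whether GPU offload of the imatrix computation is the extent of what's feasible today.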