I'm finding that if I follow recent papers, my imatrices are quite large for a multilingual model, and it can take ~16 hrs just to generate an imatrix for the 2B model I am working on. I have a home lab with two Mac minis (older M1s with 16 GB of RAM) and my main laptop, an M3 Max MacBook Pro, and I would like to use them together. I know I can split the data and run each machine separately, but it is hard to eyeball what the split should look like (how large each dataset should be). Is there any guidance on how best to split up the task? Is this something that could be fairly easily automated in the llama-imatrix command? It feels like it, since I believe I can literally just …
Replies: 1 comment
In fact, currently you can only wait until the first chunk completes so you have a time estimate; you can use that to fiddle with the batch size and micro-batch size while watching resource draw. I understand this section of the code is being revised (see the other post that was just answered on a related topic).
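The "wait for the first chunk" estimate can be turned into a rough proportional split by hand. A minimal sketch, assuming you time one chunk on each machine first — `split_sizes` and all the numbers are hypothetical helpers for illustration, not anything in llama-imatrix itself:

```python
# Sketch: split a calibration set across machines in proportion to their
# measured throughput (chunks/hour, taken from each machine's first chunk).
# The function name and figures are hypothetical, not part of llama-imatrix.

def split_sizes(total_chunks: int, throughputs: list[float]) -> list[int]:
    """Divide total_chunks proportionally to per-machine throughput."""
    total = sum(throughputs)
    sizes = [int(total_chunks * t / total) for t in throughputs]
    sizes[0] += total_chunks - sum(sizes)  # hand rounding remainder to the fastest box
    return sizes

# Example: M3 Max plus two M1 minis, throughputs in chunks/hour.
print(split_sizes(1000, [60.0, 20.0, 20.0]))  # -> [600, 200, 200]
```

Each machine then runs llama-imatrix on its slice, and the partial imatrix files are merged afterward; if I remember right, llama-imatrix can combine several input imatrix files into one output, which is what makes this kind of split workable at all.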