How to do a simple parameter merge in a memory efficient manner? #17010
Unanswered · aniquetahir asked this question in Q&A · 0 replies
I have the parameters of a moderately large model, and I wrote a function to do the LoRA operation, i.e. $(W + AB)x$ instead of $Wx$, where $AB$ is the additional low-rank update. Now I want to add $AB$ into $W$, where $AB$ applies to only some of the parameters (not all).
Here, `params` are the original parameters as a nested NamedTuple, `params_flat` is just the flattened tree obtained from `params`, and `lora_params` are the additional LoRA parameters.
Here is my code:
Here is the issue: `merge_lora_params` (which adds $AB$ to $W$ for some parameters) works fine. `insert_q_v_params`, which puts the merged $W + AB$ back into the parameter structure, runs out of memory. `get_merged_params`, which computes $W + AB$, puts it in a flat parameter list, and then unflattens the tree, takes so much RAM that compilation fails. Is there a workaround for this? Everything works without jit compilation.
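One common way to cut the peak memory of a merge like this under `jit` is buffer donation: `donate_argnums` lets XLA reuse the base parameters' buffers for the output instead of allocating a second full copy of the model. The sketch below is illustrative only (the names `merge_lora`, `merge_one`, and the toy trees are assumptions, not the original poster's code); it merges the low-rank updates leaf-by-leaf with `jax.tree_util.tree_map`, which accepts a second tree whose structure extends the first's at leaf positions, so each weight can be paired with either `None` or an `(A, B)` pair.

```python
from functools import partial

import jax
import jax.numpy as jnp


def merge_one(w, delta):
    """Add the low-rank update to one weight; delta is None where no LoRA applies."""
    if delta is None:
        return w
    a, b = delta
    return w + a @ b


# donate_argnums=(0,) tells XLA it may overwrite the buffers of `params`
# with the output, avoiding a second full copy of the model under jit.
# (Donation is ignored on CPU with a warning; on GPU/TPU it roughly
# halves the peak memory of the merge.)
@partial(jax.jit, donate_argnums=(0,))
def merge_lora(params, lora_params):
    # `lora_params` is flattened up to the structure of `params`, so each
    # leaf of `params` is paired with either None or an (A, B) pair.
    return jax.tree_util.tree_map(merge_one, params, lora_params)


params = {"q": jnp.ones((2, 2)), "mlp": jnp.ones((2, 2))}
lora = {"q": (0.5 * jnp.ones((2, 1)), jnp.ones((1, 2))), "mlp": None}
merged = merge_lora(params, lora)
```

This keeps the merge inside a single jitted call without building an intermediate flat list of merged parameters, which may be what is blowing up RAM in `get_merged_params`; whether donation fully solves it depends on the backend and on no other live reference to `params` existing at call time.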