Does the ChatGPT RLHF training example support shard init + TP + LoRA? #3303
Unanswered
taishiciR asked this question in Community | Q&A
Replies: 1 comment 1 reply
-
@taishiciR TP is currently unnecessary for most cases. You can use LoRA + ZeRO + Gemini for large models.
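A minimal sketch in plain PyTorch (not the ColossalAI API) of why that combination works: ZeRO/Gemini shard parameters for memory but gather them back to full shape for each forward/backward pass, so the LoRA branch always sees the full hidden_size, whereas TP-style shard init leaves each rank holding a physically split weight that an unmodified LoRA matmul does not account for. The layer sizes and module name below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative LoRA-wrapped linear layer (plain PyTorch sketch)."""

    def __init__(self, in_features=2048, out_features=2048, r=16, scaling=1.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        # Low-rank factors: A is (r, in_features), B is (out_features, r).
        self.lora_A = nn.Parameter(torch.zeros(r, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.lora_dropout = nn.Dropout(p=0.0)
        self.scaling = scaling

    def forward(self, x):
        result = self.base(x)
        # Same form as the update in chatgpt/models/lora.py: x @ A^T @ B^T.
        # This only works if A's second dim equals x's last dim (the full
        # hidden_size), which ZeRO/Gemini preserve at compute time.
        return result + (self.lora_dropout(x) @ self.lora_A.t() @ self.lora_B.t()) * self.scaling

x = torch.randn(8, 96, 2048)
print(LoRALinear()(x).shape)  # torch.Size([8, 96, 2048])
```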
-
When turning shard init on in the RLHF step of the ChatGPT example, I ran into a runtime error like "mat1 and mat2 shapes cannot be multiplied (768x2048 and 512x16)" in the actor's generate function.
So I wonder:
Does [shard init + TP] simply conflict with [LoRA] for now?
Or am I using it the wrong way?
│ /usr/local/python3.9.16/lib/python3.9/site-packages/colossalai/tensor/colo_tensor.py:184 in │
│ torch_function │
│ │
│ 181 │ │ │ │ return backward_tensor.backward(**tensor_kwargs) │
│ 182 │ │ │
│ 183 │ │ with torch._C.DisableTorchFunction(): │
│ ❱ 184 │ │ │ ret = func(*args, **kwargs) │
│ 185 │ │ │ if func in _get_my_nowrap_functions(): │
│ 186 │ │ │ │ return ret │
│ 187 │ │ │ else: │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: mat1 and mat2 shapes cannot be multiplied (768x2048 and 512x16)
The error comes from chatgpt/models/lora.py line 89,
"result = result + (self.lora_dropout(x) @ self.lora_A.t() @ self.lora_B.t()) * self.scaling"
when multiplying self.lora_dropout(x) and self.lora_A.t(), with shapes torch.Size([8, 96, 2048]) and torch.Size([512, 16]).
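The mismatch can be reproduced outside the trainer with the shapes from the traceback (a sketch; world_size is assumed to be 4, so 2048 / 4 = 512, and 8 * 96 = 768 is the flattened batch dimension in the error message):

```python
import torch

x = torch.randn(8, 96, 2048)    # activations keep the full hidden_size
lora_A = torch.randn(16, 512)   # shard init left lora_A with hidden_size / world_size
lora_B = torch.randn(2048, 16)

try:
    x @ lora_A.t() @ lora_B.t()
except RuntimeError as e:
    print(e)  # mat1 and mat2 shapes cannot be multiplied (768x2048 and 512x16)

# With the unsharded lora_A the same expression works:
lora_A_full = torch.randn(16, 2048)
print((x @ lora_A_full.t() @ lora_B.t()).shape)  # torch.Size([8, 96, 2048])
```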
By the way, I observed that:
If shard init is turned on, the lora_A param is split by world_size, i.e. hidden_size / world_size:
decoder.layers.0.self_attn.k_proj.lora_A
torch.Size([16, 512])
If shard init is turned off, the lora_A param keeps the full hidden_size:
decoder.layers.0.self_attn.k_proj.lora_A
torch.Size([16, 2048])
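One hedged workaround idea (an assumption, not a confirmed ColossalAI fix): keep the LoRA matrices replicated at their full hidden_size on every rank, so that only the base weight participates in shard init / TP. The helper below is hypothetical and only shows the intended shapes.

```python
import torch
import torch.nn as nn

def add_replicated_lora(linear: nn.Linear, r: int = 16) -> nn.Linear:
    # Hypothetical helper: lora_A keeps (r, full in_features), e.g. (16, 2048),
    # on every rank so it matches the unsharded activations, while the base
    # weight alone is split by shard init / TP.
    linear.lora_A = nn.Parameter(torch.zeros(r, linear.in_features))
    linear.lora_B = nn.Parameter(torch.zeros(linear.out_features, r))
    return linear
```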
@ht-zhou