Replies: 2 comments
-
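Sorry for the inconvenience. The previously open-sourced code did contain a bug. We have attempted a fix, and detailed performance testing is still in progress.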
-
Description
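Using a dataset that has trained successfully on other models, I ran full-parameter fine-tuning with sft_finetune.sh, and the resulting model's output is very strange: a long run of spaces followed by a single period.
Using the code from HF, loading the original model (MiniCPM-2B-sft-bf16) runs as expected; after loading the fine-tuned model, the output is:
Note 1: DeepSpeed is set to stage 2 without offload, which differs from the original script provided. I don't know whether this has any effect.
Note 2: Each training example has already been limited to 512 tokens.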
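For readers trying to reproduce the setup in the two notes above, here is a minimal sketch of what those settings might look like. It is not the repository's actual configuration: the dictionary uses standard DeepSpeed config keys for ZeRO stage 2 with no `offload_optimizer` entry, the `bf16` setting is an assumption based on the model name, and `truncate_example` is a hypothetical helper that caps an already-tokenized example at 512 tokens.

```python
import json

MAX_TOKENS = 512  # per-example cap mentioned in Note 2

# Assumed DeepSpeed config: ZeRO stage 2 with no optimizer offload.
# Leaving out "offload_optimizer" keeps optimizer state on the GPU.
ds_config = {
    "bf16": {"enabled": True},  # assumption: bf16 training
    "zero_optimization": {
        "stage": 2,
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
}


def truncate_example(input_ids, labels, max_tokens=MAX_TOKENS):
    """Hypothetical helper: cut token ids and labels to the same length."""
    if len(input_ids) != len(labels):
        raise ValueError("input_ids and labels must have the same length")
    return input_ids[:max_tokens], labels[:max_tokens]


if __name__ == "__main__":
    # Write this JSON to a file to pass it to a DeepSpeed launcher.
    print(json.dumps(ds_config, indent=2))
    ids, labs = truncate_example(list(range(600)), list(range(600)))
    print(len(ids), len(labs))
```

Truncating `input_ids` and `labels` together keeps them aligned. If the labels were cut at a different point than the inputs, the loss would be computed against shifted targets, which is one possible source of degenerate output worth ruling out.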
Case Explanation
No response