
Problems encountered during reproduction #36

Open
carbonatedbeverages opened this issue Apr 8, 2025 · 0 comments

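Hi, thank you very much for your work! I am trying to reproduce your code and have run into some problems that I hope you can help with. They mainly concern the hyperparameter settings. The default script you provide on GitHub uses lr=5e-5, epochs=3, word_freq_lamda=0.3, and so on. The lr in particular seems inconsistent with the 3e-6 stated in the paper? From the paper's sentence “We train DiffusionBERT using the AdamW optimizer for 1.9 million steps with learning rate of 3e-6, dropout probability of 0.1, batch size of 32.” I infer that you trained for about two epochs?
With lr=3e-6, epoch=1, num_steps=2048, I trained for one epoch and then sampled (topk=30, num_steps=512). The results were very poor: every sample looks like “The . The .. The”. Is this because the model has not been trained for enough epochs, or is some other parameter set incorrectly? Could you also tell me roughly how long one complete training run took you?
I look forward to your reply. Thank you!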
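
For reference, the two-epoch estimate comes from the back-of-the-envelope arithmetic below. The step count and batch size are the values quoted from the paper. The training-set size of roughly 30 million sequences is an assumption made only for this estimate, not a figure taken from the paper or the repository, and `implied_epochs` is a hypothetical helper name.

```python
def implied_epochs(total_steps: int, batch_size: int, dataset_size: int) -> float:
    """Return how many passes over the data a given number of optimizer steps implies."""
    return total_steps * batch_size / dataset_size


# Values quoted from the paper.
PAPER_STEPS = 1_900_000
PAPER_BATCH_SIZE = 32

# ASSUMPTION: approximate number of training sequences; replace with the real count.
ASSUMED_DATASET_SIZE = 30_000_000

samples_seen = PAPER_STEPS * PAPER_BATCH_SIZE
epochs = implied_epochs(PAPER_STEPS, PAPER_BATCH_SIZE, ASSUMED_DATASET_SIZE)

print(f"samples seen: {samples_seen}")        # samples seen: 60800000
print(f"implied epochs: {epochs:.2f}")        # implied epochs: 2.03
```

If the actual training-set size differs, the implied epoch count scales inversely with it, so the estimate should be recomputed with the true number.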
