Is there a problem with a setting in VocabParallelClassifier1D? #2256
Unanswered
yhcc
asked this question in
Community | Q&A
Replies: 1 comment
-
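Thank you very much for the feedback. This does look like a real problem. gather_output was originally meant mainly for models that, after the transformer, use two consecutive linear layers to output the logits (BERT, for example): it applies to the second-to-last linear layer and ensures that layer's output tensor is not parallel (in 1D, the column linear at that position would otherwise necessarily output a parallel tensor). We will improve the readability of the related interfaces as soon as possible, and your suggestions are welcome. Thanks.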
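To make the point about gather_output concrete, here is a minimal single-process sketch of a 1D column-parallel linear layer. It is not ColossalAI's implementation: the function names (`matmul`, `split_columns`, `column_parallel_linear`) and the list-based "ranks" are assumptions made purely for illustration, and the concatenation stands in for what would be an all-gather across devices.

```python
# Single-process simulation of a 1D column-parallel linear layer.
# All names are illustrative; this is not ColossalAI's code.

def matmul(x, w):
    """Multiply a (batch x in) matrix by an (in x out) matrix (lists of lists)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, world_size):
    """Split the output (column) dimension of w evenly across ranks."""
    out_features = len(w[0])
    assert out_features % world_size == 0
    shard = out_features // world_size
    return [[row[r * shard:(r + 1) * shard] for row in w] for r in range(world_size)]

def column_parallel_linear(x, w, world_size, gather_output):
    """Return the output held by each simulated rank."""
    partial = [matmul(x, w_r) for w_r in split_columns(w, world_size)]
    if not gather_output:
        # Each rank keeps only its own slice of the output columns.
        return partial
    # Stand-in for an all-gather: concatenate the slices along the last dimension,
    # so every rank ends up with the full (non-parallel) output.
    full = [sum((p[i] for p in partial), []) for i in range(len(x))]
    return [full for _ in range(world_size)]

x = [[1, 2], [3, 4]]                  # batch=2, in_features=2
w = [[1, 0, 2, 1], [0, 1, 1, 3]]      # in_features=2, out_features=4

sharded = column_parallel_linear(x, w, world_size=2, gather_output=False)
gathered = column_parallel_linear(x, w, world_size=2, gather_output=True)
```

With gather_output=False each simulated rank holds a 2x2 slice of the output. With gather_output=True every rank holds the full 2x4 result, which is what a following ordinary (non-parallel) layer would expect as input.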
0 replies
-
I noticed that VocabParallelClassifier1D sets the following:
ColossalAI/colossalai/nn/layer/parallel_1d/layers.py
Line 345 in c8c7910
Meanwhile, this environment variable appears to be used when the loss is computed:
ColossalAI/colossalai/nn/loss/__init__.py
Line 32 in c8c7910
However, VocabParallelClassifier1D also has this parameter:
ColossalAI/colossalai/nn/layer/parallel_1d/layers.py
Line 316 in c8c7910