Hi @wgcban
Thank you for your paper and code for AdaMAE.
In this line, a Multinomial distribution is used to sample the indices of the visible tokens given the probabilities p_x. Could you please explain whether this sampling operation is differentiable during back-propagation?
From what I understand, REINFORCE is applied in this part (Line 71 to Line 80). Is there any connection between sampling from a Categorical distribution in this part and sampling from the Multinomial distribution above? I am a bit confused about how the two relate. Could you please clarify?
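
To make the question concrete, here is a minimal sketch of the generic score-function (REINFORCE) estimator, written in plain Python. It is not the AdaMAE code: the names `softmax`, `sample_index`, `grad_log_prob`, `reinforce_grad`, and `exact_grad` are hypothetical, the reward values are made up, and it draws one index at a time rather than a set of visible tokens. It only illustrates the general point that a sampled index is an integer with no gradient, while the log-probability of that index does have a gradient with respect to the logits.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    # Drawing an index is a discrete operation: the result is an integer,
    # so no gradient can flow through it back to `probs`.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def grad_log_prob(probs, k):
    # Gradient of log p_k with respect to the logits: one_hot(k) - probs.
    return [(1.0 if j == k else 0.0) - p for j, p in enumerate(probs)]


def reinforce_grad(logits, reward_fn, num_samples, rng):
    # Monte Carlo estimate of d/d(logits) E_{k ~ p}[reward_fn(k)],
    # using the identity grad E[r] = E[r * grad log p_k].
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for _ in range(num_samples):
        k = sample_index(probs, rng)
        r = reward_fn(k)
        g = grad_log_prob(probs, k)
        for j in range(len(grad)):
            grad[j] += r * g[j] / num_samples
    return grad


def exact_grad(logits, reward_fn):
    # Closed form for comparison: p_j * (r_j - sum_k p_k * r_k).
    probs = softmax(logits)
    expected = sum(p * reward_fn(k) for k, p in enumerate(probs))
    return [p * (reward_fn(j) - expected) for j, p in enumerate(probs)]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [0.5, -1.0, 2.0, 0.0]
    rewards = [1.0, 0.0, 3.0, -2.0]  # illustrative values only
    print(reinforce_grad(logits, lambda k: rewards[k], 20000, rng))
    print(exact_grad(logits, lambda k: rewards[k]))
```

With enough samples the two printed gradients should be close, even though nothing is differentiated through the sampling step itself. Whether the Multinomial call and the Categorical call in the repository play these two roles is exactly what I would like to have confirmed.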