rfcs: extend layer norm to support root mean square normalization #3147
Conversation
also inviting @pengzhao-intel and @gaurides to take a look
confirmed with @pengzhao-intel and @gaurides that this option is preferred
Regarding open questions:
- I would vote for `dnnl_use_zero_mean` or `dnnl_no_mean`, as this is closer to how we split `dnnl_use_scale` and `dnnl_use_shift` (sketched below).
- I would vote for omitting mean, but it might be worth checking with the PyTorch team what the expectation is.
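For illustration only (not part of the RFC text), a minimal sketch of how such a flag could combine with the existing scale/shift flags in the C++ API. The name `normalization_flags::use_zero_mean` is one of the candidates discussed above and is hypothetical; it does not exist in released oneDNN.

```cpp
// Illustrative only: `normalization_flags::use_zero_mean` is a hypothetical
// name proposed in this RFC, not an existing oneDNN flag.
#include "oneapi/dnnl/dnnl.hpp"

int main() {
    using namespace dnnl;

    engine eng(engine::kind::cpu, 0);

    // 32 x 512 tensor, normalized over the last (channel) dimension.
    memory::dims dims = {32, 512};
    auto data_md = memory::desc(
            dims, memory::data_type::f32, memory::format_tag::nc);

    // RMS normalization = layer normalization with the mean fixed at zero;
    // gamma (scale) is kept, beta (shift) is typically omitted.
    auto flags = normalization_flags::use_scale
            | normalization_flags::use_zero_mean; // hypothetical flag name

    auto pd = layer_normalization_forward::primitive_desc(eng,
            prop_kind::forward_inference, data_md, data_md, 1.e-5f, flags);
    auto lnorm = layer_normalization_forward(pd);

    // ... create src/dst/scale memory objects and execute as usual.
    return 0;
}
```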
Force-pushed from 791544a to 4fb9ee2
LGTM
Proposal to extend the current Layer Normalization primitive to support RMS normalization via a new flag (see the formula sketch below).
Read here: https://github.com/uxlfoundation/oneDNN/blob/rfcs/rfcs/20251004-rms-norm/README.md
Implementation: #3068
JIRA: https://jira.devtools.intel.com/browse/MFDNN-13287
Note: The current proposal is aligned with Keras RMS normalization, but not with Keras Layer Normalization with rms_scaling, due to keras-team/keras#21234.
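For reference, RMS normalization differs from layer normalization only in that the mean is not subtracted (equivalently, it is treated as zero) and the shift (beta) is typically omitted; sketched over a normalization axis of size n:

```math
\mathrm{LayerNorm}(x) = \gamma \cdot \frac{x - \mu}{\sqrt{\sigma^2 + \varepsilon}} + \beta,
\qquad
\mathrm{RMSNorm}(x) = \gamma \cdot \frac{x}{\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 + \varepsilon}}
```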