Skip to content

rfcs: extend layer norm to support root mean square normalization #3147

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
May 7, 2025

Conversation

mzhukova
Copy link
Contributor

@mzhukova mzhukova commented Apr 23, 2025

Proposal to extend current Layer Normalization primitive to support RMS normalization via new flag.

Read here: https://github.com/uxlfoundation/oneDNN/blob/rfcs/rfcs/20251004-rms-norm/README.md
Implementation: #3068
JIRA: https://jira.devtools.intel.com/browse/MFDNN-13287

// Note: Current proposal is aligned with keras RMS normalization, but not Layer Normalization with rms_scaling due to keras-team/keras#21234.

@mzhukova mzhukova self-assigned this Apr 23, 2025
@github-actions github-actions bot added the RFC A design document label Apr 23, 2025
@mzhukova mzhukova requested a review from a team April 23, 2025 19:58
@mzhukova
Copy link
Contributor Author

also inviting @pengzhao-intel and @gaurides to take a look

@mzhukova
Copy link
Contributor Author

confirmed with @pengzhao-intel and @gaurides, that this options is preferred --

(Preferred) Omit mean in the output:
Pros: Cleaner design and avoids unnecessary memory requirement.
Cons: Requires users to adjust their code to handle the absence of the mean for RMSNorm comparing to LayerNorm.

@mzhukova mzhukova requested a review from dzarukin April 26, 2025 00:20
Copy link
Contributor

@mgouicem mgouicem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding open questions:

  • I would vote for dnnl_use_zero_mean or dnnl_no_mean, as this is closer to how we split dnnl_use_scale and dnnl_use_shift.
  • I would vote for omitting mean, but it might be worth checking with PyTorch team what the expectation is.

@mzhukova mzhukova requested a review from mgouicem May 2, 2025 00:36
@mzhukova mzhukova force-pushed the mzhukova/rfcs/20251004-rms-norm branch from 791544a to 4fb9ee2 Compare May 2, 2025 21:20
@mzhukova mzhukova requested a review from gaurides May 2, 2025 21:30
@mzhukova
Copy link
Contributor Author

mzhukova commented May 2, 2025

hi @dzarukin, @mgouicem and @gaurides, thank you for your feedback!
I believe I've addressed all the current comments, so could you please re-review/close opened threads and let me know if anything else is needed here or approve.

Copy link

@gaurides gaurides left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mzhukova mzhukova requested review from dzarukin and removed request for dzarukin May 7, 2025 00:20
@mzhukova mzhukova merged commit e1d1e4b into rfcs May 7, 2025
1 check passed
@mzhukova mzhukova deleted the mzhukova/rfcs/20251004-rms-norm branch May 7, 2025 00:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
RFC A design document
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants