
Conversation

@hemildesai
Contributor

No description provided.

if lm_head is not None:
    fully_shard_default(lm_head)
    # Use custom mixed precision policy for lm_head if lm_head_precision is specified
    if lm_head_precision == torch.float32:
Contributor

Is it possible to inspect the lm_head to figure out the precision?

Contributor Author

This option is to force the lm_head to fp32 regardless of the checkpoint dtype. An fp32 lm_head helps with RL stability.
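For reference, here is a minimal sketch (not the PR's actual implementation) of how an fp32 lm_head can be enforced under FSDP2, assuming the fully_shard and MixedPrecisionPolicy API from torch.distributed.fsdp (older PyTorch releases expose them under torch.distributed._composable.fsdp) and an already-initialized process group / device mesh. The helper name shard_lm_head is hypothetical, and the PR's fully_shard_default wrapper is not reproduced here.

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import MixedPrecisionPolicy, fully_shard

    def shard_lm_head(lm_head: nn.Module, lm_head_precision=None) -> None:
        # Hypothetical helper: shard the lm_head, optionally pinning it to fp32.
        if lm_head_precision == torch.float32:
            # Keep parameters, gradient reduction, and logits in fp32 even if the
            # checkpoint and the rest of the model run in bf16.
            fp32_policy = MixedPrecisionPolicy(
                param_dtype=torch.float32,
                reduce_dtype=torch.float32,
                output_dtype=torch.float32,
            )
            fully_shard(lm_head, mp_policy=fp32_policy)
        else:
            # Otherwise fall back to the default sharding settings.
            fully_shard(lm_head)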

Signed-off-by: Hemil Desai <hemild@nvidia.com>
@hemildesai force-pushed the hemil/fp32-lmhead-rope branch from 3579886 to 945039c on November 5, 2025 at 21:32
@copy-pr-bot

copy-pr-bot bot commented Nov 5, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Collaborator

@adil-a left a comment

LGTM

@adil-a
Collaborator

adil-a commented Nov 6, 2025

/ok to test 2e79028

@adil-a enabled auto-merge (squash) November 6, 2025 17:01
@adil-a merged commit 8316227 into main Nov 6, 2025
51 checks passed
@adil-a deleted the hemil/fp32-lmhead-rope branch November 6, 2025 20:37

4 participants