NUS-HPC-AI-Lab/MERIT
MERIT Optimizer

ICML 2025 | Paper

This is the official implementation of the MERIT optimizer from the paper "MERIT: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training". Please cite the paper and star this repo if you find MERIT useful. Thanks!

Method

MERIT is a novel optimizer that leverages the max-norm to calculate the trust ratio, effectively constraining the maximum attention logit. Furthermore, MERIT constructs element-wise trust ratios to enable more robust update scaling by focusing on local weight structures.
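As a rough illustration of the idea (a simplified sketch, not the authors' exact update rule -- see the paper and `optim/merit.py` for the real algorithm), a max-norm-based element-wise trust ratio over local weight structures (here, rows) could look like:

```python
import numpy as np

def elementwise_trust_ratio(weight, update, eps=1e-8, clip=1.0):
    """Toy sketch of a MERIT-style element-wise trust ratio.

    For each row (treated as a local weight structure), the ratio of the
    max-norm of the weights to the max-norm of the proposed update scales
    that row's step size; ratios are clipped so no element takes an
    oversized step. This is a hypothetical simplification for intuition.
    """
    w_norm = np.max(np.abs(weight), axis=1, keepdims=True)  # row-wise max-norm
    u_norm = np.max(np.abs(update), axis=1, keepdims=True)
    ratio = w_norm / (u_norm + eps)
    return np.minimum(ratio, clip)  # cap the ratio to bound the step

# Scale a raw (e.g. AdamW-style) update direction by the trust ratio.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))          # toy weight matrix
U = rng.normal(size=(4, 8)) * 0.01   # toy raw update
scaled_update = elementwise_trust_ratio(W, U) * U
```

Using the max-norm (rather than the L2 norm, as in LAMB) ties the scaling directly to the largest-magnitude entries, which is what bounds the maximum attention logit.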

MERIT optimizer pseudocode

Visualization

Compared to LAMB and AdamW, MERIT better controls the maximum attention logit.


Usage

```python
from optim.merit import MERIT

optimizer = MERIT(
    model.parameters(),
    lr=2e-4,
    weight_decay=1e-2,
    betas=(0.9, 0.95),
)
```

Citation

```bibtex
@inproceedings{luo2025merit,
  title={{MERIT}: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training},
  author={Yang Luo and Zangwei Zheng and Ziheng Qin and Zirui Zhu and Yong Liu and Yang You},
  booktitle={Forty-second International Conference on Machine Learning},
  year={2025},
  url={https://openreview.net/forum?id=NSxKNNFni0}
}
```
