Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order LLM Fine-Tuning

This repository contains the code for experiments that apply the Jaguar SignSGD, Jaguar Muon, and ZO-Muon methods to a range of LLM fine-tuning tasks.
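
For readers new to zero-order sign-based updates, the toy sketch below may help build intuition. It is not the repository's implementation and not the paper's exact algorithm: it is a generic illustration, under our own assumptions, of a sign step driven by a per-coordinate momentum buffer that is refreshed from forward evaluations only. The names `zo_sign_step`, `quadratic_loss`, `tau`, `beta`, and `lr` are hypothetical and chosen for this example.

```python
import random


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def zo_sign_step(loss_fn, x, m, rng, lr=0.01, tau=1e-3, beta=0.9):
    """One generic zero-order sign step (illustrative assumption, not the paper's method).

    x is a list of parameters; m is a momentum buffer of the same length,
    updated in place. Only loss evaluations are used, no backpropagation.
    """
    i = rng.randrange(len(x))  # sample one coordinate uniformly at random
    x_plus = list(x)
    x_plus[i] += tau
    x_minus = list(x)
    x_minus[i] -= tau
    # Two-point finite difference along coordinate i.
    g_i = (loss_fn(x_plus) - loss_fn(x_minus)) / (2.0 * tau)
    # Refresh only the sampled coordinate of the momentum buffer.
    m[i] = beta * m[i] + (1.0 - beta) * g_i
    # Every coordinate with a nonzero buffer entry moves by exactly lr.
    return [xj - lr * sign(mj) for xj, mj in zip(x, m)]


def quadratic_loss(x, target=(1.0, -2.0, 0.5, 3.0)):
    """Toy objective standing in for a model loss."""
    return 0.5 * sum((xj - tj) ** 2 for xj, tj in zip(x, target))


rng = random.Random(0)  # fixed seed so the run is reproducible
x = [0.0] * 4
m = [0.0] * 4
initial_loss = quadratic_loss(x)
for _ in range(3000):
    x = zo_sign_step(quadratic_loss, x, m, rng)
final_loss = quadratic_loss(x)
print(initial_loss, final_loss)
```

In this toy, each step costs two loss evaluations and the only optimizer state is a single buffer the size of the parameters. For the actual methods and their hyperparameters, refer to `trainer.py` and `run_script.sh` in this repository.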

The code is based on the benchmark

Requirements

To install requirements:

pip install -r requierements.txt

To train and evaluate the model in the paper, run this command:

./run_script.sh
