Add: Sparse Finetuning Integration for Axolotl
#2
Conversation
Force-pushed from 14757d1 to 4438e5d
awesome! really nice and compact. Should we get feedback from MLR team? Maybe a demo in a call?
        Input arguments for Sparse Finetuning.
        """

        recipe: Optional[Any] = None
See the note above in the example for namespacing this and setting up proper types. Along those lines, we should set expectations for the type of the recipe: specifically, that we support either a dictionary arg passed in or a string representing the path to a file or a model stub.
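To make the dict-or-path expectation concrete, here is a minimal sketch of the resolution logic (the helper name `resolve_recipe` is hypothetical and not part of this PR):

```python
from pathlib import Path
from typing import Any, Union

def resolve_recipe(recipe: Union[dict, str]) -> Any:
    """Hypothetical helper: accept a recipe as an inline dict,
    a path to a recipe file, or a model stub string."""
    if isinstance(recipe, dict):
        return recipe  # inline recipe dict, use as-is
    if isinstance(recipe, str) and Path(recipe).is_file():
        return Path(recipe).read_text()  # load recipe text from a local file
    return recipe  # otherwise treat the string as a model stub / identifier
```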
Right now we only support specifying full recipes; I can make that update in a separate diff if needed.
Force-pushed from 4438e5d to 16ccf21
A few minor things, but otherwise looks good to me
    LOG = logging.getLogger("axolotl.integrations.llmcompressor_sft")


    class SFTCallbackHandler(TrainerCallback):
Can we rename this to something like CompressorCallbackHandler? I don't want to hardcode this on sparsity since the implementation is more generic and ideally will enable other pathways rather than just sparsity in the near future
Done!
Co-authored-by: Mark Kurtz <mark.j.kurtz@gmail.com>
@markurtz I've addressed all your changes in commits:
This PR introduces sparse fine-tuning support in Axolotl using LLMCompressor as a plugin. This integration allows users to efficiently fine-tune models with structured/unstructured sparsity.
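As a rough sketch of how such a plugin is typically enabled in an Axolotl config (the class name and argument namespace below are assumptions, not taken from this PR):

```yaml
# Hypothetical config fragment; only the module path
# axolotl.integrations.llmcompressor_sft comes from this PR.
plugins:
  - axolotl.integrations.llmcompressor_sft.SFTPlugin  # assumed class name

llmcompressor:          # assumed argument namespace
  recipe: recipe.yaml   # path to an LLMCompressor recipe (or an inline dict)
```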
Key Changes
- `src/axolotl/integrations/llmcompressor_sft/__init__.py` for integrating sparse fine-tuning.
- `src/axolotl/integrations/llmcompressor_sft/args.py` to configure sparse training.
- `examples/llama-3/sft.yaml` to showcase sparse fine-tuning on LLaMA-3 models. (Note: a test run with SparseLlama 8B 2:4 on gsm8k is in progress; the example and lm_eval results will be updated here once it completes.)
- `src/axolotl/utils/models.py` for sparse model handling.
- `src/axolotl/integrations/llmcompressor_sft/args.py` and `__init__.py`.

Installation
Training Command
To train with sparse fine-tuning, use:
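The exact command was collapsed in the page export; assuming the standard Axolotl CLI entry point and the example config added in this PR, it would look something like:

```shell
# Sketch, assuming the stock Axolotl training entry point;
# adjust the accelerate flags for your hardware.
accelerate launch -m axolotl.cli.train examples/llama-3/sft.yaml
```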
Test Run Output
Sparsity Verification
To verify that 2:4 structured sparsity is maintained after fine-tuning:
Script
Command:
python3 check_safetensors.py --file_path /path/to/model.safetensors --regex ".*layers.*[qkv]_proj.*" --check_sparsity
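The `check_safetensors.py` script itself wasn't captured in the export; the core 2:4 check it performs can be sketched as follows (a simplified, numpy-only version; the function name is hypothetical):

```python
import numpy as np

def is_24_sparse(weight: np.ndarray) -> bool:
    """Check 2:4 structured sparsity: at most 2 nonzero values in
    every contiguous group of 4 (tensor size must be divisible by 4)."""
    flat = weight.reshape(-1, 4)              # group weights into blocks of 4
    nonzeros = np.count_nonzero(flat, axis=1) # nonzero count per block
    return bool((nonzeros <= 2).all())
```

In the full script, this check would run on every tensor whose name matches the `--regex` filter above.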
Running in vLLM
To load the fine-tuned sparse model in vLLM:
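The inference script was collapsed in the export; a minimal sketch using vLLM's offline `LLM` API (the model path is a placeholder):

```python
from vllm import LLM, SamplingParams

# Placeholder path to the fine-tuned sparse checkpoint
llm = LLM(model="/path/to/finetuned-sparse-model")
params = SamplingParams(max_tokens=64, temperature=0.0)

outputs = llm.generate(["Natalia sold clips to 48 of her friends."], params)
print(outputs[0].outputs[0].text)
```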
Script
Flow diagram
Order of callbacks:
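The diagram itself didn't survive the export. Assuming the handler follows the standard `transformers.TrainerCallback` lifecycle, the relevant events fire in roughly this order (which hooks the compressor handler actually uses is an assumption):

```python
# Standard order of transformers TrainerCallback events across one
# training run; the compressor handler would initialize the
# llmcompressor session early and finalize it at the end (assumed).
EVENT_ORDER = [
    "on_init_end",
    "on_train_begin",
    "on_epoch_begin",
    "on_step_begin",
    "on_step_end",    # fires once per optimizer step
    "on_epoch_end",
    "on_train_end",
]

for event in EVENT_ORDER:
    print(event)
```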