@MagellaX MagellaX commented Aug 9, 2025

PR description

Adds first-class support for Microsoft Phi-2 with robust, minimal integration.

What/why

  • Implement FastPhiModel for Phi-2; enable Unsloth fastpaths.
  • Handle partial RoPE correctly (rotation is applied only to the first rotary_dim features of each head; the rotary fraction defaults to 0.4 when the config omits it).
  • Provide deterministic, stateless residual dropout (device-agnostic, seedable) for attention/MLP outputs.
  • Wire loader dispatch and add alias mapping for 4-bit loading.
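
For readers unfamiliar with partial RoPE, here is a minimal sketch of the idea in plain Python. It is not the code in unsloth/models/phi.py. The function names `rotary_dim_from_factor` and `apply_partial_rope`, the `base` of 10000, and the half-split pairing of features are assumptions made for illustration. It rotates the first `rotary_dim` features of a single head vector and passes the rest through unchanged.

```python
import math


def rotary_dim_from_factor(head_dim, factor=0.4):
    """Hypothetical helper: number of features that receive rotation."""
    return int(head_dim * factor)


def apply_partial_rope(x, position, rotary_dim, base=10000.0):
    """Rotate the first `rotary_dim` features of one head vector.

    Assumes the half-split pairing: feature i is paired with
    feature i + rotary_dim // 2. Features past `rotary_dim` are untouched.
    """
    if rotary_dim % 2 != 0 or not 0 <= rotary_dim <= len(x):
        raise ValueError("rotary_dim must be even and at most len(x)")
    half = rotary_dim // 2
    # One rotation angle per feature pair, scaled by the token position.
    angles = [position * base ** (-2.0 * i / rotary_dim) for i in range(half)]
    first = [x[i] * math.cos(a) - x[i + half] * math.sin(a)
             for i, a in enumerate(angles)]
    second = [x[i + half] * math.cos(a) + x[i] * math.sin(a)
              for i, a in enumerate(angles)]
    # Pass-through part is copied unchanged.
    return first + second + list(x[rotary_dim:])


# Usage: a head of size 80 with factor 0.4 rotates 32 features.
head = [float(i) for i in range(80)]
rotated = apply_partial_rope(head, position=5, rotary_dim=rotary_dim_from_factor(80))
```

Because each pair undergoes a pure rotation, the norm of the rotated slice is preserved, and position 0 leaves the vector unchanged.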

Highlights

  • unsloth/models/phi.py: Phi attention forward (partial RoPE), CausalLM fastpath, post-patch defaults, deterministic dropout attachment.
  • unsloth/models/loader.py: dispatch model_type == "phi"; call model post_patch.
  • unsloth/models/mapper.py: add unsloth/Phi-2-bnb-4bit alias.
  • unsloth/kernels: new deterministic dropout; partial RoPE helpers; safe LayerNorm/GeLU hooks (torch-backed by default).
  • tests/qlora/test_unsloth_qlora_train_and_merge.py: Phi-2 smoke test (loads and runs forward; skips if weights unavailable).
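
To illustrate what "deterministic, stateless" dropout can mean, here is a rough sketch in plain Python. It is not the kernel added in unsloth/kernels. The names `_mix64` and `stateless_dropout`, and the choice of a SplitMix64-style integer hash, are assumptions made for illustration. The keep/drop decision for each element depends only on the seed and the element index, so no global RNG state is read or advanced.

```python
MASK64 = (1 << 64) - 1


def _mix64(z):
    """SplitMix64-style mixing step on a 64-bit integer (illustrative choice)."""
    z = (z + 0x9E3779B97F4A7C15) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def stateless_dropout(values, p, seed, training=True):
    """Drop each element with probability `p`, keyed only by (seed, index).

    Kept elements are scaled by 1 / (1 - p) (inverted dropout).
    The same inputs always produce the same mask.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(values)
    scale = 1.0 / (1.0 - p)
    key = _mix64(seed & MASK64)
    out = []
    for i, v in enumerate(values):
        # Map the hash to a uniform float in [0, 1) using its top 53 bits.
        u = (_mix64(key ^ i) >> 11) / float(1 << 53)
        out.append(v * scale if u >= p else 0.0)
    return out


# Usage: the same seed reproduces the same mask on every call.
a = stateless_dropout([1.0] * 8, p=0.25, seed=1234)
b = stateless_dropout([1.0] * 8, p=0.25, seed=1234)
```

A counter-based scheme like this gives the same mask regardless of call order, which is what makes it seedable and independent of device RNG state. A real kernel would apply the same idea elementwise on tensors.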

Compatibility

  • No behavior changes to non-Phi models.
  • Kernel hooks are optional and torch-backed by default.

MagellaX and others added 2 commits August 10, 2025 01:41
Collaborator

@Datta0 Datta0 left a comment

Thanks for the contributions. A few minor comments:
Please remove the unnecessary __pycache__ files.

@MagellaX
Author

Hey @danielhanchen any thoughts bro?
