
No grad on visual LoRA layers #193

@tobiapoppi

Description


Hi @2U1, congrats on your great work and thanks for your huge effort in actively maintaining this repo.

I have a problem and can't figure out what's happening:
I am fine-tuning Qwen-2.5-VL-7B with DPO on both image and video data, modifying the DPO training script to also support chosen/rejected pairs on the inputs rather than chosen/rejected pairs on the answers.
For context, here is what happens with each configuration:

- VE frozen, LLM frozen, Merger trainable: everything looks fine.
- VE frozen, LLM frozen, Merger trainable, lora_enabled (only on LLM): everything looks fine.
- VE frozen, LLM frozen, Merger trainable, lora_enabled + vision_lora (on both VE and LLM): after `.backward()`, `self.model.base_model.model.model.language_model.layers[0].self_attn.q_proj.lora_B.default.weight.grad` exists and updates correctly across steps, but `self.model.base_model.model.model.visual.blocks[0].attn.qkv.lora_B.default.weight.grad` is `None` (not zero-valued, it literally has no grad). This holds for every single LoRA param on the vision encoder.
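For reference, this is a minimal, self-contained sketch of the check I run after `.backward()`. The toy module here is just a stand-in for the real Qwen paths: detaching the "visual" branch's output reproduces the exact symptom, where `requires_grad` is `True` everywhere but the grads never reach the visual parameters.

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.visual = nn.Linear(4, 4)         # stands in for visual.blocks[...].attn.qkv.lora_B
        self.language_model = nn.Linear(4, 1)

    def forward(self, x):
        v = self.visual(x).detach()  # a detach() anywhere here kills the grad path to `visual`
        return self.language_model(v)

model = Toy()
model(torch.randn(2, 4)).sum().backward()

# Params that require grad but received none after backward:
missing = [n for n, p in model.named_parameters()
           if p.requires_grad and p.grad is None]
print(missing)  # ['visual.weight', 'visual.bias'] -- same pattern I see on the VE LoRA params
```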

I debugged this: the optimizer correctly contains every LoRA param, and all LoRA params have `requires_grad = True`.

Even if I disable LoRA on the LLM and keep it only on the VE (which is actually what I would like to do), I hit the same problem. I don't understand what's happening here. I am debugging without DeepSpeed enabled.
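One thing I also checked, in case the vision tower's forward was running under `torch.no_grad()` somewhere in the trainer (a plausible cause when the VE is "frozen"): a forward hook on a vision block that inspects whether the block's output is attached to the autograd graph. This is a toy sketch, with a plain `nn.Linear` standing in for `visual.blocks[0]`, simulating that failure mode.

```python
import torch
import torch.nn as nn

block = nn.Linear(4, 4)  # stands in for model...visual.blocks[0]
seen = {}

def fhook(module, args, output):
    # If this is False, the block ran detached (e.g. under no_grad),
    # so no gradient can ever reach its LoRA weights.
    seen["requires_grad"] = output.requires_grad

block.register_forward_hook(fhook)

with torch.no_grad():  # simulating a trainer that "freezes" the VE this way
    block(torch.randn(2, 4))
print(seen)  # {'requires_grad': False}
```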

GPUs: A100-SXM4-40GB
python: 3.11.12
torch: 2.8.0+cu128
torchvision: 0.23.0+cu128
accelerate: 1.10.1
peft: 0.15.2
transformers: 4.56.1

Thanks in advance :)
Great code!
