Lora Merge For CLS Model #208

@ZhanliangAaronWang

Problem Description

I'm seeing a large accuracy drop when I load a LoRA adapter for inference. During training-time evaluation the model reaches ~70% accuracy, but after I load the saved LoRA weights for inference, accuracy falls to ~2% (essentially random predictions). Inference with the fully fine-tuned (non-LoRA) model works correctly ✓

LoRA Configuration

{
  "base_model_name_or_path": "Qwen/Qwen2.5-VL-7B-Instruct",
  "bias": "none",
  "inference_mode": true,
  "lora_alpha": 64,
  "lora_dropout": 0.05,
  "peft_type": "LORA",
  "r": 64,
  "target_modules": [
    "score",  // Classification head
    "q_proj", "v_proj", "k_proj", "o_proj",
    // ... other MLP layers
  ],
  "task_type": "CAUSAL_LM"
}
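The config above can be checked mechanically. The sketch below is a minimal, hypothetical helper (`check_adapter_config` and `check_adapter_dir` are illustrative names, not part of PEFT) that flags two things visible in this report: a `task_type` other than `SEQ_CLS`, and `score` appearing in `target_modules` rather than `modules_to_save`. It assumes the real `adapter_config.json` on disk is plain JSON without the `//` annotations shown here.

```python
import json
from pathlib import Path


def check_adapter_config(cfg: dict) -> list:
    """Return human-readable warnings for a PEFT adapter config dict."""
    warnings = []

    task_type = cfg.get("task_type")
    if task_type != "SEQ_CLS":
        warnings.append(
            f"task_type is {task_type!r}; sequence classification usually uses 'SEQ_CLS'"
        )

    targets = cfg.get("target_modules") or []
    if isinstance(targets, str):  # PEFT also accepts a single string here
        targets = [targets]
    saved = cfg.get("modules_to_save") or []

    if "score" in targets:
        # A LoRA delta is relative to the head's base weight, so the same
        # base weight has to be restored at inference time.
        warnings.append("'score' is LoRA-wrapped via target_modules, not fully saved")
    elif "score" not in saved:
        warnings.append("'score' is in neither target_modules nor modules_to_save")

    return warnings


def check_adapter_dir(adapter_dir: str) -> list:
    """Hypothetical convenience wrapper: read adapter_config.json from a directory."""
    cfg = json.loads((Path(adapter_dir) / "adapter_config.json").read_text())
    return check_adapter_config(cfg)


# Values copied from the config in this report (elided modules left out).
issue_cfg = {
    "task_type": "CAUSAL_LM",
    "target_modules": ["score", "q_proj", "v_proj", "k_proj", "o_proj"],
}
for warning in check_adapter_config(issue_cfg):
    print("WARNING:", warning)
```

If `score` is a freshly initialized layer in the base checkpoint, a LoRA delta trained on top of one random initialization will probably not reproduce the training-time outputs when applied to a different one, which would be consistent with near-random accuracy. This is a possible explanation, not a confirmed diagnosis.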

Inference Code Snippet

import torch
from peft import PeftModel
# Qwen2_5_VLForSequenceClassification, config, base_model_path and
# lora_checkpoint_path are defined elsewhere (not shown in this snippet).

# Load the base model
model = Qwen2_5_VLForSequenceClassification.from_pretrained(
    base_model_path,
    config=config,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, lora_checkpoint_path)
model = model.merge_and_unload()  # Also tried without merging
model.eval()

# Inference results in ~2% accuracy

What I've Tried

  1. ✓ Verified that adapter_model.bin and adapter_config.json are saved correctly
  2. ✓ Confirmed that non_lora_state_dict.bin contains the classification head weights
  3. ✓ Tried both with and without merge_and_unload()
  4. ✓ Verified the base model path is correct
  5. ✓ Checked that CLASS_2_ID mapping is consistent between training and inference

Suspected Issues

  1. Classification head initialization: The classification head might be getting re-initialized during LoRA loading
  2. State dict merging: The non-LoRA parameters (including classification head) might not be loaded correctly
  3. Task type mismatch: Using task_type="CAUSAL_LM" for a classification task
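
Suspected issues 1 and 2 can be narrowed down by checking whether the keys in non_lora_state_dict.bin line up with the model's own parameter names, since the inference snippet does not show that file being loaded. The following is a minimal sketch under stated assumptions: `remap_keys`, `load_non_lora_weights`, and the prefix list are hypothetical (the prefixes are a guess at how wrappers such as PEFT or DDP rename parameters), and loading after `merge_and_unload()` is one possible ordering, not a confirmed fix.

```python
# Wrapper prefixes that may appear on saved keys (assumption, adjust as needed).
PREFIXES = ("base_model.model.", "base_model.", "module.")


def remap_keys(saved_keys, model_keys):
    """Map saved parameter names onto model parameter names.

    Returns (mapping, unmatched): mapping is {saved_key: model_key},
    unmatched lists saved keys that fit no model parameter.
    """
    model_keys = set(model_keys)
    mapping, unmatched = {}, []
    for key in saved_keys:
        candidate = key
        # Strip one wrapper prefix at a time until the name matches.
        while candidate not in model_keys:
            for prefix in PREFIXES:
                if candidate.startswith(prefix):
                    candidate = candidate[len(prefix):]
                    break
            else:
                break  # no prefix left to strip
        if candidate in model_keys:
            mapping[key] = candidate
        else:
            unmatched.append(key)
    return mapping, unmatched


def load_non_lora_weights(model, path):
    """Load the non-LoRA weights into an already merged model (sketch)."""
    import torch  # imported lazily so remap_keys works without torch installed

    saved = torch.load(path, map_location="cpu")
    mapping, unmatched = remap_keys(saved.keys(), model.state_dict().keys())
    remapped = {new: saved[old] for old, new in mapping.items()}
    # strict=False: only the non-LoRA subset is being loaded here.
    model.load_state_dict(remapped, strict=False)
    return sorted(mapping.values()), unmatched


# Toy demonstration with made-up key names.
demo_mapping, demo_unmatched = remap_keys(
    ["base_model.model.score.weight", "score.bias", "foo.bar"],
    ["score.weight", "score.bias", "model.layers.0.mlp.weight"],
)
print(demo_mapping)
print(demo_unmatched)
```

If the saved classification-head keys end up in `unmatched`, or nothing maps to `score.*`, the head is probably being left at its freshly initialized values, which would point at suspected issue 2.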

Thanks a lot for your time and help!
