
Cannot load decoder.lm_head.weight when loading 4 bit quantized model using VisionEncoderDecoder.from_pretrained #1343

Open
@AditiJain14

Description


System Info

I am trying to load a Donut model that was finetuned and then quantized to 4-bit. save_pretrained works fine, but when I load the quantized model (at quant_path) with
model = VisionEncoderDecoderModel.from_pretrained(quant_path, load_in_4bit=True), every parameter is loaded correctly except decoder.lm_head.weight, which is reset instead. I am unable to find the cause of this issue, and it happens both when (1) I load the quantized model and (2) when I load the finetuned checkpoint with the load_in_4bit argument.

I have tried the same steps with the 'naver-clova-ix/donut-base' model from Hugging Face and it works fine. Any help would be much appreciated!

Reproduction

from transformers import VisionEncoderDecoderModel

# finetuned_model is the finetuned Donut VisionEncoderDecoderModel
# safe_serialization=True discards lm_head.weight, so save with safe_serialization=False
finetuned_model.save_pretrained(finetuned_path, safe_serialization=False)
model = VisionEncoderDecoderModel.from_pretrained(finetuned_path, load_in_4bit=True)
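To make the symptom concrete, here is a minimal check (a sketch only; it assumes finetuned_path points to the checkpoint saved above and that the Donut decoder exposes lm_head as in the error):

import torch
from transformers import VisionEncoderDecoderModel

# Reference lm_head weight from the full-precision finetuned checkpoint
ref = VisionEncoderDecoderModel.from_pretrained(finetuned_path)
ref_lm_head = ref.decoder.lm_head.weight.detach().clone()

# Reload in 4-bit: every other parameter matches, but decoder.lm_head.weight comes back reinitialized
model = VisionEncoderDecoderModel.from_pretrained(finetuned_path, load_in_4bit=True)
quant_lm_head = model.decoder.lm_head.weight.detach().float().cpu()
print(torch.allclose(ref_lm_head, quant_lm_head))  # expected True, but prints False for the finetuned checkpoint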

Expected behavior

The model is loaded with decoder.lm_head.weight taken from the finetuned checkpoint.
