Description
System Info
I am trying to load a Donut model that was fine-tuned and then quantized to 4-bit. While save_pretrained works fine, when I try to load the quantized model (at quant_path) with
model = VisionEncoderDecoderModel.from_pretrained(quant_path, load_in_4bit=True)
all of the parameters load correctly except decoder.lm_head.weight, which is instead reset. I cannot find the cause of this issue, and it happens both when (1) I load the quantized model and (2) I load the fine-tuned checkpoint with the load_in_4bit argument.
I have tried the same steps with the 'naver-clova-ix/donut-base' model from the Hugging Face Hub and it works fine. Any help would be much appreciated!
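For reference, here is roughly how I check the mismatch (quant_path is a placeholder for my local checkpoint directory; with safe_serialization=False the weights live in pytorch_model.bin):

import torch
from transformers import VisionEncoderDecoderModel

quant_path = "path/to/quantized-donut"  # hypothetical local checkpoint directory

# Raw state dict as saved on disk (legacy pickle format, safe_serialization=False)
ckpt = torch.load(f"{quant_path}/pytorch_model.bin", map_location="cpu")

# Load the same checkpoint in 4-bit
model = VisionEncoderDecoderModel.from_pretrained(quant_path, load_in_4bit=True)

# bitsandbytes typically leaves lm_head unquantized, so a direct comparison works
loaded = model.decoder.lm_head.weight.detach().float().cpu()
saved = ckpt["decoder.lm_head.weight"].float()
print(torch.allclose(loaded, saved))  # False here, i.e. the weight was reset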
Reproduction
from transformers import VisionEncoderDecoderModel

# safe_serialization=True discards lm_head.weight, so use the legacy pickle format
finetuned_model.save_pretrained(finetuned_path, safe_serialization=False)
# Reloading in 4-bit produces a reset decoder.lm_head.weight
model = VisionEncoderDecoderModel.from_pretrained(finetuned_path, load_in_4bit=True)
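As an aside on the safe_serialization comment above: safetensors skips tensors that share storage with another saved tensor, so one way to see whether lm_head is tied to the decoder's input embeddings (a sketch, reusing the same finetuned_model as above) is:

# If these share storage, the head is tied to the embeddings, which is the
# usual reason safetensors drops lm_head.weight on save
emb = finetuned_model.decoder.get_input_embeddings().weight
head = finetuned_model.decoder.lm_head.weight
print(emb.data_ptr() == head.data_ptr())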
Expected behavior
The model is loaded with decoder.lm_head.weight taken from the fine-tuned checkpoint.