Exported ONNX model files are much larger than expected, compared to the ones created by an older version #2241

@whitphx

Description

System Info

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy
$ python -V
Python 3.12.6
$ uv pip freeze
accelerate==1.6.0
certifi==2025.1.31
charset-normalizer==3.4.1
coloredlogs==15.0.1
filelock==3.18.0
flatbuffers==25.2.10
fsspec==2025.3.2
huggingface-hub==0.30.2
humanfriendly==10.0
idna==3.10
jinja2==3.1.6
markupsafe==3.0.2
mpmath==1.3.0
networkx==3.4.2
numpy==2.2.5
nvidia-cublas-cu12==12.6.4.1
nvidia-cuda-cupti-cu12==12.6.80
nvidia-cuda-nvrtc-cu12==12.6.77
nvidia-cuda-runtime-cu12==12.6.77
nvidia-cudnn-cu12==9.5.1.17
nvidia-cufft-cu12==11.3.0.4
nvidia-cufile-cu12==1.11.1.6
nvidia-curand-cu12==10.3.7.77
nvidia-cusolver-cu12==11.7.1.2
nvidia-cusparse-cu12==12.5.4.2
nvidia-cusparselt-cu12==0.6.3
nvidia-nccl-cu12==2.26.2
nvidia-nvjitlink-cu12==12.6.85
nvidia-nvtx-cu12==12.6.77
onnx==1.17.0
onnxruntime==1.20.1
onnxslim==0.1.48
optimum @ git+https://github.com/huggingface/optimum.git@b04feaea78cda58d79b8da67dca3fd0c4ab33435
packaging==25.0
protobuf==6.30.2
psutil==7.0.0
pyyaml==6.0.2
regex==2024.11.6
requests==2.32.3
safetensors==0.5.3
setuptools==79.0.1
sympy==1.13.3
tokenizers==0.21.1
torch==2.7.0
tqdm==4.67.1
transformers==4.49.0
triton==3.3.0
typing-extensions==4.13.2
urllib3==2.4.0

Who can help?

@michaelbenayoun

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

ONNX model files exported with optimum.exporters.onnx.main_export are larger than the ones created with an older version. @xenova suggested this appears to be an issue around the weight-deduplication step.
Reference: https://huggingface.co/Xenova/nllb-200-distilled-600M/discussions/3
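To illustrate what "weight deduplication" means here, the sketch below groups tensors whose raw bytes are identical, which is one simple way to spot shared weights (e.g. tied embeddings) that were written out twice. This is a stdlib-only illustration, not optimum's actual implementation; the `find_duplicate_tensors` helper is my own name, and in practice the `(name, raw_bytes)` pairs would come from something like `(t.name, t.raw_data)` over an ONNX graph's initializers.

```python
import hashlib
from collections import defaultdict

def find_duplicate_tensors(tensors):
    """Group tensor names by a hash of their raw bytes.

    `tensors` is an iterable of (name, raw_bytes) pairs. Returns a list of
    name groups that share byte-identical data, i.e. candidates that a
    deduplication pass should have collapsed into a single initializer.
    """
    groups = defaultdict(list)
    for name, raw in tensors:
        groups[hashlib.sha256(raw).hexdigest()].append(name)
    return [names for names in groups.values() if len(names) > 1]
```

If the exporter's deduplication worked, running a check like this over the exported graph's initializers should return no groups; duplicated embedding/LM-head weights would show up as a multi-name group.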

As an example, I converted https://huggingface.co/facebook/nllb-200-distilled-600M.

  • onnx/decoder_model.onnx that was converted with the older version was 1860454885 bytes (~1.86GB)
  • The newly converted onnx/decoder_model.onnx_data is 2909290496 bytes (~2.91GB), plus onnx/decoder_model.onnx at 430168 bytes (~430kB).
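Since the new exporter splits the model into a small .onnx graph file and an external-data sidecar, comparing sizes fairly means summing both files. A minimal stdlib helper for that (the `onnx_total_size` name and the `.onnx_data` sidecar naming convention shown in the bullets above are the only assumptions):

```python
from pathlib import Path

def onnx_total_size(onnx_path):
    """Total size in bytes of a .onnx file plus its external-data
    sidecar (e.g. decoder_model.onnx + decoder_model.onnx_data)."""
    p = Path(onnx_path)
    total = p.stat().st_size
    sidecar = p.with_suffix(p.suffix + "_data")
    if sidecar.exists():
        total += sidecar.stat().st_size
    return total
```

With the numbers above, the new export totals 2909290496 + 430168 ≈ 2.91 GB against ~1.86 GB for the old single-file export, roughly a 1.56x increase.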

Expected behavior

The newly converted model files should be similar in total size to those produced by the older version.
