Open
Labels: bug (Something isn't working)
Description
System Info
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.5 LTS
Release: 22.04
Codename: jammy
$ python -V
Python 3.12.6
$ uv pip freeze
accelerate==1.6.0
certifi==2025.1.31
charset-normalizer==3.4.1
coloredlogs==15.0.1
filelock==3.18.0
flatbuffers==25.2.10
fsspec==2025.3.2
huggingface-hub==0.30.2
humanfriendly==10.0
idna==3.10
jinja2==3.1.6
markupsafe==3.0.2
mpmath==1.3.0
networkx==3.4.2
numpy==2.2.5
nvidia-cublas-cu12==12.6.4.1
nvidia-cuda-cupti-cu12==12.6.80
nvidia-cuda-nvrtc-cu12==12.6.77
nvidia-cuda-runtime-cu12==12.6.77
nvidia-cudnn-cu12==9.5.1.17
nvidia-cufft-cu12==11.3.0.4
nvidia-cufile-cu12==1.11.1.6
nvidia-curand-cu12==10.3.7.77
nvidia-cusolver-cu12==11.7.1.2
nvidia-cusparse-cu12==12.5.4.2
nvidia-cusparselt-cu12==0.6.3
nvidia-nccl-cu12==2.26.2
nvidia-nvjitlink-cu12==12.6.85
nvidia-nvtx-cu12==12.6.77
onnx==1.17.0
onnxruntime==1.20.1
onnxslim==0.1.48
optimum @ git+https://github.com/huggingface/optimum.git@b04feaea78cda58d79b8da67dca3fd0c4ab33435
packaging==25.0
protobuf==6.30.2
psutil==7.0.0
pyyaml==6.0.2
regex==2024.11.6
requests==2.32.3
safetensors==0.5.3
setuptools==79.0.1
sympy==1.13.3
tokenizers==0.21.1
torch==2.7.0
tqdm==4.67.1
transformers==4.49.0
triton==3.3.0
typing-extensions==4.13.2
urllib3==2.4.0
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction (minimal, reproducible, runnable)
ONNX model files exported with optimum.exporters.onnx.main_export are larger than the ones created with an older version, which @xenova suggested seems to be an issue around the weight deduplication step.
Reference: https://huggingface.co/Xenova/nllb-200-distilled-600M/discussions/3

As an example, I converted https://huggingface.co/facebook/nllb-200-distilled-600M:
- onnx/decoder_model.onnx converted with the older version was 1,860,454,885 bytes (~1.86 GB).
- The newly converted onnx/decoder_model.onnx_data is 2,909,290,496 bytes (~2.91 GB), plus onnx/decoder_model.onnx at 430,168 bytes (~430 kB).
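The ~1 GB size difference is consistent with tied weights (e.g. a shared embedding and lm_head matrix) being serialized twice instead of once. As an illustrative, stdlib-only sketch of what content-based weight deduplication does (this is not optimum's actual implementation; the tensor names below are hypothetical), identical byte payloads can be detected by hashing and aliased to a single stored copy:

```python
import hashlib

def dedup_tensors(tensors):
    """Map duplicate tensors (name -> raw bytes) to a canonical name.

    Tensors with identical byte payloads (e.g. tied embedding / lm_head
    weights) are reported as aliases of the first copy seen, so they
    need to be stored on disk only once.
    """
    seen = {}     # content digest -> canonical tensor name
    aliases = {}  # duplicate name -> canonical name
    for name, data in tensors.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            aliases[name] = seen[digest]  # duplicate payload: alias it
        else:
            seen[digest] = name
    return aliases

# Hypothetical tied weights: lm_head shares bytes with the shared embedding.
weights = {
    "model.shared.weight": b"\x00\x01" * 1024,
    "lm_head.weight": b"\x00\x01" * 1024,          # identical -> duplicate
    "model.encoder.layer0.weight": b"\x02\x03" * 1024,
}
print(dedup_tensors(weights))  # {'lm_head.weight': 'model.shared.weight'}
```

If this aliasing step is skipped (or fails) during export, every tied tensor is written out in full, which would roughly match the observed jump from ~1.86 GB to ~2.91 GB.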
Expected behavior
The newly converted model should be roughly the same size as the one produced by the older exporter (~1.86 GB).