Releases: onnx/onnx-mlir
v0.5.0.0
ONNX-MLIR v0.5.0.0 is now available with exciting new features. We thank everyone who contributed to this release!
Please visit onnx-mlir to learn more about ONNX-MLIR.
Key Updates
- ONNX 1.17.0
- PyBind 2.12.0
- Benchmark 1.8.4
- IBM z17 NNPA Telum II Support Enabled
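The dependency bumps listed above can be checked against a local environment. The following is a minimal sketch and not part of ONNX-MLIR: the helper names (parse_version, meets_minimum, check_installed) are hypothetical, and it assumes the ONNX Python package is installed under the distribution name onnx. It uses only importlib.metadata from the standard library and, for simplicity, ignores pre-release suffixes.

```python
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '1.17.0' into (1, 17, 0).

    Parsing stops at the first component that is not purely numeric,
    so a suffix such as 'rc1' is dropped (a deliberate simplification).
    """
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, required: str) -> bool:
    """Return True when 'installed' is at least 'required'."""
    have, need = parse_version(installed), parse_version(required)
    # Pad the shorter tuple with zeros so '1.17' compares equal to '1.17.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_installed(package: str, required: str) -> str:
    """Describe whether the installed distribution satisfies the minimum."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed (need >= {required})"
    status = "ok" if meets_minimum(installed, required) else "too old"
    return f"{package}: {installed} ({status}, need >= {required})"


if __name__ == "__main__":
    # 1.17.0 is the ONNX version named in the Key Updates list.
    print(check_installed("onnx", "1.17.0"))
```

Running the script prints one status line for the onnx package. The same helper could be pointed at other Python distributions, but which packages a given build actually needs is outside what this release note states.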
What's Changed
- Add a python script for generating text using huggingface gpt2 by @tungld in #2983
- Remove a spike of memory usage in ScrubDisposablePass. by @imaihal in #2978
- RunONNXModel.py: Add a --cache-model=path option by @AlexandreEichenberger in #2984
- Enable check-onnx-backend-numerical-nnpa on Jenkins s390x by @tungld in #2985
- RunONNXModel.py: save compilation info into a file when using --save-model or --cache-model by @tungld in #2994
- Fix wrong total number of phases for EmitObj and EmitJNI by @tungld in #2995
- run_gpt2_from_huggingface.py: do not download the onnx data file if it exists by @tungld in #2996
- Opening binary constants files fix on zOS by @christopherlmunoz in #2991
- [NNPA] Memory reduction of stickified constant by stickifying at file writing by @imaihal in #2917
- Option to not emit the full MLIR (only emit .tmp file) by @imaihal in #2997
- RunONNXModel.py: allow to change the default model name by @tungld in #2999
- upgrade to ONNX 1.17.0 (opset 22) by @gongsu832 in #3004
- Add decomposition for ONNXSoftmaxCrossEntropyLossOp by @srcarroll in #2968
- Delay scrubbing disposable elements attrs as long as possible by @tungld in #3006
- Add limitation for BFLOAT supported ops for NNPA by @Sunny-Anand in #3008
- Test the return value of omMMapBinaryFile function and terminate the main program elegantly by @tungld in #3002
- Fix a wrong function call by @tungld in #3012
- Making runtime omunreachable static to support clang compiler by @christopherlmunoz in #3015
- Fix security vulnerabilities by @Sunny-Anand in #3019
- Do not fuse locations when normalizing constants for Add and Mul by @jorickert in #3016
- Handle full reduction over all dimensions by @tungld in #3022
- Use DisposableElementsAttr for ZHigh constant propagation by @tungld in #3013
- Re-enable diagnostic error/warning printing by @AlexandreEichenberger in #3020
- Transform SequenceAt to split for special cases by @chentong319 in #3018
- Add tolerance args to CheckONNXModel.py by @AlexandreEichenberger in #3024
- Return a failure instead of crashing if shape inference cannot be run because of unranked operand types by @jorickert in #3023
- upgrade benchmark by @Sunny-Anand in #3027
- Update llvm-project to llvm/llvm-project@01d233ff403823389f848 by @hamptonm1 in #3011
- Update llvm-project to llvm/llvm-project@af20aff35ec3 by @hamptonm1 in #3032
- Fix biasScaleShape of GroupNormalizationV21 to support ranks > 4 by @jorickert in #3030
- Merge from repo by @AlexandreEichenberger in #3033
- Update llvm-project to llvm/llvm-project@e86910337f98 by @hamptonm1 in #3037
- Best practice by @AlexandreEichenberger in #3039
- [NNPA] Fix some bugs for ReduceMin/Max by @tungld in #3038
- Skip over uninitialized DenseResourceAttrs in verifiers by @jorickert in #3041
- [NNPA] Revise compiler options for quantization by @tungld in #3043
- Update the instruction for building multiple accelerators by @tungld in #3046
- Add a document for quantization on NNPA by @tungld in #3045
- update onnx opset by @Sunny-Anand in #3050
- Remove element type restriction in softmax lowering by @srcarroll in #3051
- Fix ASAN/UBSAN issues in DimAnalysis by @jorickert in #3052
- Build light weight PyRuntime without llvm or onnx-mlir by @chentong319 in #3044
- Option to set the number of threads for parallel compilation by @imaihal in #3048
- Update onnx requirement to 1.17.0 by @jorickert in #3054
- Optimization for Roberta unstick->reshape->transpose->reshape->stick by @AlexandreEichenberger in #3056
- Extend GridSample support by @jorickert in #3060
- Remove the pattern unstick_4ds_squeeze_stick_3ds by @tungld in #3062
- Instrumentation cleanup when operation was removed by @AlexandreEichenberger in #3061
- Add support for ONNX.shape with permutation pattern by @AlexandreEichenberger in #3066
- Update docker image to point to github registry in devcontainer-example by @jorickert in #3055
- Parallelization of ConstProp compilation by @imaihal in #3042
- Bump various ops to opset 22, adding bf16 support by @jorickert in #3059
- Bump onnx.Cast to opset 21, adding int4/uint4 support by @jorickert in #3057
- Add runtime check for Gather Op by @chentong319 in #3069
- fix weak hash by @Sunny-Anand in #3070
- Remove the compile option -nnpa-clip-to-dlfloat-range by @tungld in #3075
- Matmul CPU performance regression by @AlexandreEichenberger in #3072
- ZHigh to ONNX optimization is default on. Switch flag from enable to disable by @AlexandreEichenberger in #3074
- Since compiler generated stick/unstick is default on, change new option to disable it by @AlexandreEichenberger in #3073
- Add lit tests for KrnlMatmulOp lowering (Krnl to affine) by @AlexandreEichenberger in #3076
- Upgrading llvm and stablehlo hash by @christopherlmunoz in #3053
- Don't try to free static array in mnist example by @Zentrik in #3049
- Handle out-of-bound value for Gather alike operation by @chentong319 in #3077
- Extend instrumentSignature to print data by @chentong319 in #3078
- Modifying RunONNXModel.py to better support external performance profiling tools by @AlexandreEichenberger in #3082
- Add support to use either docker or local compiler to compile a model by @chentong319 in #3081
- Use docker and podman package in python driver by @chentong319 in #3087
- update pybind11 to version 2.12.0 by @chentong319 in #3088
- Bump Upsample to Opset 10 and change the opset versioning to allow to skip over opset versions if a newer, backwards compatible one exists. by @jorickert in #3065
- Improve scripts by @AlexandreEichenberger in #3089
- Add result type inference to RandomNormalLike and fix wrong hardcodings for dtypes by @jorickert in #3091
- Bump various ops to opset 21, adding int4/uint4 and 8 bit float support. by @jorickert in #3064
- Added minimal support to do some timing of OM Runtime functionality by @AlexandreEichenberger in #3095
- Including __errno_location call for MVS by @christopherlmunoz in #3099
- Rewriting pattern to remove WhereOp and EqualOp. by @imaihal in #3094
- Enable NNPA saturation by default and change the option to --nnpa-disable-saturation by @tungld in #3101
- removing weak attribute of errno by @christopherlmunoz in #3103
- Fix the custom build link for docs/Docker.md by @qjivy in #3104
- Python driver for torch model by @chentong319 in #3093
- Cherry pick updates from main for z17 and fix for ZHighConstantPropagation in QuantizedStick by @Sunny-Anand in #3133
- [cherry-pick]fix CVE-2025-32434 (#3135) by @Sunny-Anand in #3...
v0.4.3.0
Release for ONNX-MLIR
Full Changelog: v0.4.2.0...v0.4.3.0
v0.4.2.0
Release for ONNX-MLIR
Full Changelog: v0.4.1.2...v0.4.2.0
v0.4.1.2
Bugfix update for ONNX-MLIR
- Fixes performance regression
Full Changelog: v0.4.1.1...v0.4.1.2
v0.4.1.1
Bugfix update for ONNX-MLIR
Full Changelog: v0.4.1...v0.4.1.1
v0.4.1
Bugfix update for ONNX-MLIR
Full Changelog: v0.4.0.1...v0.4.1
v0.4.0.1
Bugfix update for ONNX-MLIR
Full Changelog: v0.4.0...v0.4.0.1
v0.4.0
Release for ONNX-MLIR
Full Changelog: v0.3.2...v0.4.0
v0.3.2
Release for ONNX-MLIR
Full Changelog: v0.3.0...v0.3.2
v0.3.1
Release for ONNX-MLIR
Full Changelog: v0.3.0...v0.3.1