Triton Model Navigator v0.14.0

@kacper-kleczewski kacper-kleczewski released this 22 Apr 16:54
  • Updates:
    • new: TensorRT INT8 and FP8 quantization through ModelOpt (ONNX path)
    • new: TensorRT NVFP4 quantization through ModelOpt (Torch path)
    • new: Improved TorchCompile performance for repeated compilations using the TORCHINDUCTOR_CACHE_DIR environment variable
    • new: Global context now supports scoped (temporary) context variables
    • new: Context variables INPLACE_OPTIMIZE_WORKSPACE_CONTEXT_KEY and INPLACE_OPTIMIZE_MODULE_GRAPH_ID_CONTEXT_KEY
    • new: nav.bundle.save now accepts include and exclude patterns for fine-grained file selection
    • new: GPU and Host memory usage logging
    • change: Install the TensorRT package for architectures other than x86_64
    • change: Disable conversion fallback for TensorRT paths and expose a control option in the custom config
    • change: Use torch.export.save for Torch-TRT model serialization
    • change: Added export_engine to OnnxConfig for improved export control
    • fix: Relative tolerance formula in the correctness command
    • fix: Memory management during the export and conversion process for Torch
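
The TORCHINDUCTOR_CACHE_DIR item above refers to a PyTorch environment variable that tells TorchInductor where to keep its on-disk compilation cache. The release notes do not say how Model Navigator sets or reads it internally, so the following is only a minimal sketch of pointing the variable at a persistent directory from user code. The helper name `enable_inductor_cache` and the directory name `inductor_cache_demo` are hypothetical and chosen for illustration.

```python
import os
import tempfile
from pathlib import Path


def enable_inductor_cache(cache_dir: str) -> Path:
    """Point TorchInductor at a persistent on-disk cache directory.

    Hypothetical helper, not part of Model Navigator or PyTorch.
    """
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    # Set the variable before the first torch.compile call so that compiled
    # artifacts are written to this directory and can be reused by later runs.
    os.environ["TORCHINDUCTOR_CACHE_DIR"] = str(path)
    return path


# Assumption: a directory under the system temp dir is used only for the demo;
# a real setup would likely pick a location that survives reboots.
cache_path = enable_inductor_cache(
    os.path.join(tempfile.gettempdir(), "inductor_cache_demo")
)
```

Keeping the cache directory stable across processes is what lets repeated compilations of the same model reuse earlier results; a fresh temporary directory on every run would not give that benefit.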