Upcoming Release Roadmap

Faith Xu edited this page Oct 16, 2023 · 8 revisions

ORT 1.17

Target Release: Early December 2023

General

  • ONNX 1.15 support
  • Optimize inlining of graphs to better support TorchDynamo exported models
  • Add support for supplying a custom logger at the session level

Builds and packages

  • Update C/C++ libs: abseil, date, nsync, googletest, wil, mp11, cpuinfo, safeint, onnx, re2
  • Drop support for CentOS 7 and update manylinux tag from manylinux2014 to manylinux_2_28
  • Official AMD build package with ROCm and MIGraphX EPs (Python + Linux only)
  • CUDA 12 official package support
  • Python 3.12 support (targeted)

Performance

  • 4-bit quantization support on NVIDIA GPUs and ARM64

Execution Providers

  • TensorRT EP
    • Shape profile support for multithreaded use
    • Fixes for stream synchronization between CUDA EP and TensorRT EP
  • QNN EP
    • QNN 2.16 support
    • Context binary caching and model initialization optimizations
    • Mixed-precision (8/16-bit) quantization support
  • OpenVINO EP
    • (TBD)

Mobile

  • Extend CoreML/NNAPI op coverage to support YOLOv8
  • Upgrade flatbuffers to support 4GB files
  • Whisper app using Azure EP
  • Enable fp16 on CoreML/NNAPI/XNNPACK?

Web

  • Support for external data format
  • Support for IO binding
  • Support for training
  • WebGPU optimizations
  • FP16 support for WebGPU (targeted)
  • WebNN included in official npm package

Training

  • Large Model Training
    • Optimizations for Dynamo-exported models
    • Optimizations for recompute optimizer
    • Integration with Falcon
    • Packages for CUDA 12.1
  • On Device Training
    • Enable training on web for federated learning scenarios