Summary of major features and improvements
More GenAI coverage and framework integrations to minimize code changes
- New models supported: Phi-4 Mini, Jina CLIP v1, and Bce Embedding Base v1.
- OpenVINO™ Model Server now supports VLM models, including Qwen2-VL, Phi-3.5-Vision, and InternVL2.
- OpenVINO GenAI now includes image-to-image and inpainting features for transformer-based pipelines, such as Flux.1 and Stable Diffusion 3 models, enhancing their ability to generate more realistic content.
- Preview: AI Playground now uses the OpenVINO GenAI backend to enable highly optimized inference performance on AI PCs.
Broader LLM model support and more model compression techniques
- Reduced binary size through optimization of the CPU plugin and removal of the GEMM kernel.
- New kernel optimizations in the GPU plugin significantly boost the performance of Long Short-Term Memory (LSTM) models, which are used in many applications, including speech recognition, language modeling, and time-series forecasting.
- Preview: Token Eviction is now implemented in OpenVINO GenAI to reduce the memory consumption of the KV cache by eliminating unimportant tokens. The current implementation is beneficial for tasks that generate long sequences, such as chatbots and code generation.
- NPU acceleration for text generation is now enabled in OpenVINO™ Runtime and OpenVINO™ Model Server to support the power-efficient deployment of VLM models on NPUs for AI PC use cases with low concurrency.
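The Token Eviction idea above can be sketched as a toy importance-based pruning policy. This is a pure-Python illustration only; the scoring metric and data layout are stand-ins, not OpenVINO GenAI's actual implementation:

```python
def evict_tokens(kv_cache, scores, budget):
    """Toy eviction policy: keep at most `budget` tokens in the KV cache,
    dropping the ones with the lowest accumulated attention scores."""
    if len(kv_cache) <= budget:
        return kv_cache
    # Rank token positions by importance and keep the top `budget`.
    keep = sorted(kv_cache, key=lambda pos: scores[pos], reverse=True)[:budget]
    # Preserve the original sequence order of the surviving tokens.
    return {pos: kv_cache[pos] for pos in sorted(keep)}

# Toy cache: position -> (key, value); scores: position -> importance.
cache = {0: ("k0", "v0"), 1: ("k1", "v1"), 2: ("k2", "v2"), 3: ("k3", "v3")}
scores = {0: 0.9, 1: 0.1, 2: 0.7, 3: 0.2}
pruned = evict_tokens(cache, scores, budget=2)
print(list(pruned))  # kept positions: [0, 2]
```

The key property is that cache size stays bounded by the budget while the tokens most relevant to future generation steps survive.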
More portability and performance to run AI at the edge, in the cloud, or locally
- Support for the latest Intel® Core™ processors (Series 2, formerly codenamed Bartlett Lake), Intel® Core™ 3 Processor N-series and Intel® Processor N-series (formerly codenamed Twin Lake) on Windows.
- Additional LLM performance optimizations on Intel® Core™ Ultra 200H series processors for improved 2nd token latency on Windows and Linux.
- Enhanced performance and efficient resource utilization with the implementation of Paged Attention and Continuous Batching by default in the GPU plugin.
- Preview: The new OpenVINO backend for ExecuTorch will enable accelerated inference and improved performance on Intel hardware, including CPUs, GPUs, and NPUs.
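The Paged Attention mechanism mentioned above can be illustrated with a toy block allocator: KV-cache memory is split into fixed-size blocks that sequences claim on demand, so batched requests of different lengths share one pool instead of each reserving a maximum-length buffer. This is a pure-Python sketch with hypothetical names, not the GPU plugin's actual data structures:

```python
class PagedKVCache:
    """Toy paged KV-cache allocator illustrating the idea behind
    Paged Attention and Continuous Batching."""
    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))
        self.block_tables = {}   # sequence id -> list of block ids
        self.lengths = {}        # sequence id -> token count

    def append_token(self, seq_id):
        n = self.lengths.get(seq_id, 0)
        if n % self.block_size == 0:  # current block full -> claim a new one
            if not self.free_blocks:
                raise MemoryError("KV-cache pool exhausted")
            self.block_tables.setdefault(seq_id, []).append(self.free_blocks.pop())
        self.lengths[seq_id] = n + 1

    def release(self, seq_id):
        # A finished sequence returns its blocks to the pool; with
        # continuous batching, a waiting request can start as soon
        # as blocks free up, without waiting for the whole batch.
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = PagedKVCache(num_blocks=4, block_size=2)
for _ in range(3):
    cache.append_token("req-A")   # 3 tokens -> occupies 2 blocks
cache.append_token("req-B")       # 1 token  -> occupies 1 block
print(len(cache.free_blocks))     # 1 block left in the pool
```

Compared with per-request max-length buffers, the pool-based layout wastes at most one partially filled block per sequence.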
Support Change and Deprecation Notices
- Discontinued in 2025:
Runtime components:
- The OpenVINO Affinity API property is no longer available. It has been replaced with CPU binding configurations (`ov::hint::enable_cpu_pinning`).
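As a configuration sketch, the pinning hint can be set through the compile-time config in the Python API (the `model.xml` path is a placeholder; assumes the `openvino.properties.hint` namespace is available in your installed version):

```python
import openvino as ov
import openvino.properties.hint as hints

core = ov.Core()
# CPU binding configuration replacing the removed Affinity API property;
# "model.xml" is a placeholder model path for illustration only.
compiled = core.compile_model("model.xml", "CPU",
                              {hints.enable_cpu_pinning: True})
```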
Tools:
- The OpenVINO™ Development Tools package (pip install openvino-dev) is no longer available for OpenVINO releases in 2025.
- Model Optimizer is no longer available. Consider using the new conversion methods instead. For more details, see the model conversion transition guide.
- Intel® Streaming SIMD Extensions (Intel® SSE) are currently not enabled in the binary package by default. They are still supported in the source code form.
- Legacy prefixes: l_, w_, and m_ have been removed from OpenVINO archive names.
OpenVINO GenAI:
- `StreamerBase::put(int64_t token)`. The `bool` return value for the callback streamer is no longer accepted; it must now return one of the three values of the `StreamingStatus` enum.
- `ChunkStreamerBase` is deprecated. Use `StreamerBase` instead.

NNCF:
- The `create_compressed_model()` method is now deprecated. The `nncf.quantize()` method is recommended for Quantization-Aware Training of PyTorch and TensorFlow models.

OpenVINO Model Server (OVMS):
- The benchmark client in C++ using the TensorFlow Serving API has been discontinued.
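The new streamer return contract can be sketched in plain Python. The enum below is a stand-in mirroring `openvino_genai.StreamingStatus`; its member names and values are illustrative assumptions, not the library's definition:

```python
from enum import Enum

class StreamingStatus(Enum):
    # Stand-in for openvino_genai.StreamingStatus; values are illustrative.
    RUNNING = 0   # continue generation
    STOP = 1      # stop generation, keep results produced so far
    CANCEL = 2    # stop generation, discard results

def streamer(subword: str) -> StreamingStatus:
    # Under the old contract a streamer callback returned a bool;
    # it must now return a StreamingStatus value instead.
    print(subword, end="", flush=True)
    return StreamingStatus.RUNNING
```

A three-valued status lets a callback distinguish "stop and keep output" from "cancel and discard", which a plain `bool` could not express.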
- Deprecated and to be removed in the future:
- `openvino.Type.undefined` is now deprecated and will be removed with version 2026.0. `openvino.Type.dynamic` should be used instead.
- APT & YUM Repositories Restructure: Starting with release 2025.1, users can switch to the new repository structure for APT and YUM, which no longer uses year-based subdirectories (such as “2025”). The old (legacy) structure will remain available until 2026, when the change will be finalized. Detailed instructions are available on the relevant documentation pages.
- OpenCV binaries will be removed from Docker images in 2026.
- Ubuntu 20.04 support will be deprecated in future OpenVINO releases due to the end of standard support.
- “auto shape” and “auto batch size” (reshaping a model in runtime) will be removed in the future. OpenVINO’s dynamic shape models are recommended instead.
- MacOS x86 is no longer recommended for use due to the discontinuation of validation. Full support will be removed later in 2025.
- The `openvino` namespace of the OpenVINO Python API has been redesigned, removing the nested `openvino.runtime` module. The old namespace is now considered deprecated and will be discontinued in 2026.0.
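For the “auto shape” deprecation above, the recommended dynamic-shape approach can be sketched as follows (placeholder model path; assumes a single-input model; note the flat `openvino` namespace rather than the deprecated `openvino.runtime`):

```python
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")   # placeholder path for illustration
# Mark the dimensions that vary as dynamic (-1) up front, instead of
# reshaping the model repeatedly at runtime.
model.reshape([-1, -1, 768])           # e.g. dynamic batch and sequence length
compiled = core.compile_model(model, "CPU")
```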
You can find OpenVINO™ toolkit 2025.1 release here:
- Download archives* with OpenVINO™
- Install it via Conda: `conda install -c conda-forge openvino=2025.1.0`
- OpenVINO™ for Python: `pip install openvino==2025.1.0`
Acknowledgements
Thanks to the OpenVINO developer community for their contributions:
@11happy
@arkhamHack
@AsVoider
@chiruu12
@darshil929
@geeky33
@itsbharatj
@jpy794
@kuanxian1
@Mohamed-Ashraf273
@nikolasavic3
@oToToT
@SaifMohammed22
@srinjoydutta03
Release documentation is available here: https://docs.openvino.ai/2025
Release Notes are available here: https://docs.openvino.ai/2025/about-openvino/release-notes-openvino.html