Releases · openvinotoolkit/openvino.genai
2025.1.0.0
What's Changed
- skip failing Chinese prompt on Win by @pavel-esir in #1573
- Bump product version 2025.1 by @akladiev in #1571
- Bump tokenizers submodule by @akladiev in #1575
- [LLM_BENCH] relax md5 checks and allow pass cb config without use_cb by @eaidova in #1570
- [VLM] Add Qwen2VL by @yatarkan in #1553 (usage sketch after this list)
- Fix links, remind about ABI by @Wovchena in #1585
- Add nightly to instructions similar to requirements by @Wovchena in #1582
- GHA: use nightly from 2025.1.0 by @ilya-lavrenov in #1577
- NPU LLM Pipeline: Switch to STATEFUL by default by @dmatveev in #1561
- Verify that the rendered chat template is not empty by @yatarkan in #1574
- [RTTI] Fix passes rtti definitions by @t-jankowski in #1588
- Test add_special_tokens properly by @pavel-esir in #1586
- Add indentation for llm_bench json report dumping by @nikita-savelyevv in #1584
- prioritize config model type during path-based task determination by @eaidova in #1587
- Replace openvino.runtime imports with openvino by @helena-intel in #1579
- Add tests for Whisper static pipeline by @eshiryae in #1250
- CB: removed handle_dropped() misuse by @ilya-lavrenov in #1594
- Bump timm from 1.0.13 to 1.0.14 by @dependabot in #1595
- Update samples readme by @olpipi in #1545
- [Speculative decoding][Prompt lookup] Enable Perf Metrics for assisting pipelines by @iefode in #1599
- [LLM] [NPU] StaticLLMPipeline: Export blob by @smirnov-alexey in #1601
- [llm_bench] enable prompt permutations to prevent prefix caching and fix vlm image load by @eaidova in #1607
- LLM: use set_output_seq_len instead of WA by @ilya-lavrenov in #1611
- CB: support different number of K and V heads per layer by @ilya-lavrenov in #1610
- LLM: fixed Slice / Gather of last MatMul by @ilya-lavrenov in #1616
- Switch to VS 2022 by @mryzhov in #1598
- Add Phi-3.5-vision-instruct and Phi-3-vision-128k-instruct by @Wovchena in #1609
- Whisper pipeline: apply slice matmul by @as-suvorov in #1623
- GHA: use OV master in mac.yml by @ilya-lavrenov in #1622
- [Image Generation] Image2Image for FLUX by @likholat in #1621 (usage sketch after this list)
- add missing ignore_eos in generation config by @eaidova in #1625
- Increase rt_info priority to fix Phi-3.5-vision-instruct and Phi-3-vision-128k-instruct (master) by @Wovchena in #1626
- Correct model name by @wgzintel in #1624
- Token rotation by @vshampor in #987
- Whisper pipeline: use Sampler by @as-suvorov in #1615 (usage sketch after this list)
- Fix setting eos_token_id with kwarg by @Wovchena in #1629
- Extract cacheopt E2E tests into separate test matrix field by @vshampor in #1630
- [CB] Split token streaming and generation to different threads for all CB based pipelines by @iefode in #1544
- Don't silence an error if a file can't be opened by @Wovchena in #1620
- [CMAKE]: use different version for macOS arm64 by @ilya-lavrenov in #1632
- Test invalid fields assignment raises in GenerationConfig by @Wovchena in #1633
- do_sample=False for NPU in chat_sample, add NPU to README by @helena-intel in #1637
- [JS] Add GenAI Node.js bindings by @vishniakov-nikolai in #1193
- CB: preparation for relying on KV cache precisions from plugins by @ilya-lavrenov in #1634
- [LLM bench] support providing adapter config mode by @eaidova in #1644
- Automatically apply chat template in non-chat scenarios by @sbalandi in #1533 (opt-out sketch after this list)
- beam_search_causal_lm.cpp: delete wrong comment by @Wovchena in #1639
- [WWB]: Fixed chat template usage in VLM GenAI pipeline by @AlexKoff88 in #1643
- [WWB]: Fixed nano-Llava preprocessor selection by @AlexKoff88 in #1646
- [WWB]: Added config to preprocessor call in VLMs by @AlexKoff88 in #1638
- CB: remove DeviceConfig class by @ilya-lavrenov in #1640
- [WWB]: Added initialization of nano-llava in case of Transformers model by @AlexKoff88 in #1649
- WWB: simplify code around start_chat / use_template by @ilya-lavrenov in #1650
- Tokenizers update by @ilya-lavrenov in #1653
- DOCS: reorganized support models for image generation by @ilya-lavrenov in #1655
- Fix using llm_bench/wwb with versions w/o apply_chat_template by @sbalandi in #1651
- Fix Qwen2VL generation without images by @yatarkan in #1645
- Parallel sampling with threadpool by @mzegla in #1252
- [Coverity] Enabling coverity scan by @akazakov-github in #1657
- [CB] Fix streaming in case of empty outputs by @iefode in #1647
- Allow overriding eos_token_id by @Wovchena in #1654
- CB: remove GenerationHandle:back by @ilya-lavrenov in #1662
- Fix tiny-random-llava-next in VLM Pipeline by @yatarkan in #1660
- [CB] Add KVHeadConfig parameters to PagedAttention's rt_info by @sshlyapn in #1666
- Bump py-build-cmake from 0.3.4 to 0.4.0 by @dependabot in #1668
- pin optimum version by @pavel-esir in #1675
- [LLM] Enabled CB by default by @ilya-lavrenov in #1455 (configuration sketch after this list)
- SAMPLER: fixed hang during destruction of ThreadPool by @ilya-lavrenov in #1681
- CB: use optimized scheduler config for cases when user explicitly asked CB backend by @ilya-lavrenov in #1679
- [CB] Return Block manager asserts to destructors by @iefode in #1569
- phi3_v: allow images, remove unused var by @Wovchena in #1670
- [Image Generation] Inpainting for FLUX by @likholat in #1685
- [WWB]: Added support for SchedulerConfig in LLMPipeline by @AlexKoff88 in #1671
- Add LongBench validation by @l-bat in #1220
- Fix Tokenizer for several added special tokens by @pavel-esir in #1659
- Unpin optimum-intel version by @ilya-lavrenov in #1680
- Image generation: proper error message when encode() is used w/o encoder passed to ctor by @ilya-lavrenov in #1683
- Fix excluding stop str from output for some tokenizers by @sbalandi in #1676
- [VLM] Fix chat template fallback in chat mode with defined system message by @yatarkan in https://github.com/openvinotoolkit/openvino.genai/pull/...
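Qwen2VL support (#1553) runs through the existing VLMPipeline API. A minimal sketch, assuming a Qwen2-VL model already exported with optimum-intel to ./Qwen2-VL-2B-Instruct-ov and a local cat.png; the keyword pattern follows the Python VLM samples:

```python
import numpy as np
import openvino as ov
import openvino_genai
from PIL import Image

# Load the image as a uint8 ov.Tensor, as the VLM samples do.
pic = Image.open("cat.png").convert("RGB")
image = ov.Tensor(np.array(pic))

# The model directory is an assumption: any Qwen2-VL export produced by optimum-intel.
pipe = openvino_genai.VLMPipeline("./Qwen2-VL-2B-Instruct-ov", "CPU")
print(pipe.generate("Describe the image.", image=image, max_new_tokens=100))
```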
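Image2Image for FLUX (#1621) extends image-to-image generation beyond the Stable Diffusion family (inpainting followed in #1685). A hedged sketch, assuming a FLUX model converted to OpenVINO IR in ./FLUX.1-schnell-ov; the call pattern mirrors the existing image2image sample:

```python
import numpy as np
import openvino as ov
import openvino_genai
from PIL import Image

# Input image as a uint8 tensor with a batch dimension, as the samples prepare it.
pic = Image.open("input.png").convert("RGB")
image = ov.Tensor(np.array(pic)[None])

pipe = openvino_genai.Image2ImagePipeline("./FLUX.1-schnell-ov", "CPU")
# strength controls how far the result may deviate from the input image.
result = pipe.generate("a watercolor landscape", image, strength=0.8)
Image.fromarray(result.data[0]).save("output.png")
```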
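Several entries above rework WhisperPipeline internals (#1615, #1623, #1250) without changing its surface. For context, a minimal transcription sketch, assuming a Whisper model exported to ./whisper-base-ov and a 16 kHz mono recording; librosa is just one way to obtain normalized float samples:

```python
import librosa
import openvino_genai

# WhisperPipeline expects 16 kHz mono float samples.
raw_speech, _ = librosa.load("sample.wav", sr=16000)

pipe = openvino_genai.WhisperPipeline("./whisper-base-ov", "CPU")
print(pipe.generate(raw_speech.tolist(), max_new_tokens=100))
```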
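#1533 changes a default: LLMPipeline now wraps plain generate() prompts in the model's chat template, not only prompts inside start_chat()/finish_chat(). A sketch of opting out, assuming an exported model in ./model-ov; the apply_chat_template field name comes from that change:

```python
import openvino_genai

pipe = openvino_genai.LLMPipeline("./model-ov", "CPU")

config = openvino_genai.GenerationConfig()
config.max_new_tokens = 100
# New default in 2025.1: the model's chat template is applied automatically.
# Disable it to pass the prompt verbatim (e.g. for base, non-chat models).
config.apply_chat_template = False

print(pipe.generate("What is OpenVINO?", config))
```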
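With #1455 continuous batching became the default LLMPipeline backend, and #1679 picks an optimized scheduler config when CB is requested explicitly. A hedged sketch of such an explicit request with a bounded KV cache; the 2 GB value is an arbitrary example:

```python
import openvino_genai

# Passing a SchedulerConfig explicitly selects the continuous batching backend.
scheduler_config = openvino_genai.SchedulerConfig()
scheduler_config.cache_size = 2  # KV cache budget, in GB (example value)

pipe = openvino_genai.LLMPipeline("./model-ov", "CPU", scheduler_config=scheduler_config)
print(pipe.generate("Hello,", max_new_tokens=20))
```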
2025.0.0.0
Please check out the latest documentation pages related to the new openvino_genai package!
2024.6.0.0
Please check out the latest documentation pages related to the new openvino_genai package!
2024.5.0.0
Please check out the latest documentation pages related to the new openvino_genai package!
2024.4.1.0
Please check out the latest documentation pages related to the new openvino_genai package!
What's Changed
- Bump OV version to 2024.4.1 by @akladiev in #894
- Update requirements.txt and add requirements_2024.4.txt by @wgzintel in #893
Full Changelog: 2024.4.0.0...2024.4.1.0
2024.4.0.0
Please check out the latest documentation pages related to the new openvino_genai package!
What's Changed
- Support chat conversation for StaticLLMPipeline by @TolyaTalamanov in #580
- Prefix caching by @popovaan in #639
- Allow building GenAI with OpenVINO via extra modules by @ilya-lavrenov in #726
- Simplified partial preemption algorithm by @popovaan in #730
- Add set_chat_template by @Wovchena in #734
- Detect KV cache sequence length axis by @as-suvorov in #744
- Enable u8 KV cache precision for CB by @ilya-lavrenov in #759
- Add test case for native pytorch model by @wgzintel in #722
- Prefix caching improvements by @popovaan in #758
- Add USS metric by @wgzintel in #762
- Prefix caching optimization by @popovaan in #785
- Transition to default int4 compression configs from optimum-intel by @nikita-savelyevv in #689
- Control KV-cache size for StaticLLMPipeline by @TolyaTalamanov in #795
- [2024.4] update optimum intel commit to include mxfp4 conversion by @eaidova in #828
- [2024.4] use perf metrics for genai in llm bench by @eaidova in #830
- Update Pybind to version 13 by @mryzhov in #836
- Introduce stop_strings and stop_token_ids sampling params [2024.4 base] by @mzegla in #817 (usage sketch after this list)
- StaticLLMPipeline: Handle single element list of prompts by @TolyaTalamanov in #848
- Fix Meta-Llama-3.1-8B-Instruct chat template by @pavel-esir in #846
- Add GPU support for continuous batching [2024.4] by @sshlyapn in #858
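#817 introduced stop_strings and stop_token_ids among the sampling parameters. A minimal sketch, assuming an exported model in ./model-ov; note that matched stop strings are excluded from the output unless include_stop_str_in_output is set:

```python
import openvino_genai

pipe = openvino_genai.LLMPipeline("./model-ov", "CPU")

config = openvino_genai.GenerationConfig()
config.max_new_tokens = 200
config.stop_strings = {"Observation:", "\n\n"}  # stop once any of these appears
config.stop_token_ids = {2}                     # e.g. an explicit EOS token id

print(pipe.generate("List three OpenVINO devices:", config))
```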
Full Changelog: 2024.3.0.0...2024.4.0.0
2024.3.0.0
Please check out the latest documentation pages related to the new openvino_genai package!
2024.2.0.0
Please check out the latest documentation pages related to the new openvino_genai package!