Releases · linkedin/Liger-Kernel
v0.5.10: Qwen3 MoE support, Sparsemax kernel, bug fixes
What's Changed
- fix zip bug by @KareemMusleh in #702
- [dpo] set default average_log_prob to False by @cyr0930 in #693
- Rank build status lower by @momochen in #707
- Add support for Qwen3 MoE models by @chiwanpark in #706
- Fix qwen3_moe flaky convergence test by @vaibhavjindal in #710
- Fix empty Medusa head tensors by @chiwanpark in #698
- Sparsemax by @AndreSlavescu in #687 (reference sketch after this list)
- fix: remove docstring imports in transformer patches by @NanoCode012 in #712
- Increase tests timeout to 45 mins by @vaibhavjindal in #718
- fix modal tests by @shivam15s in #719
- Visualizer Update by @AndreSlavescu in #717
- Sparsemax Documentation by @AndreSlavescu in #716
- Element-wise DyT: faster than the original LigerDyT by @mdy666 in #673
- GRPO loss kernel written fully in Triton, reducing memory usage by 46 GB by @mdy666 in #672
- Make FLCE compatible with FSDP and PEFT by @astefanutti in #674
- Fix incorrect module patching when using LoRA with modules_to_save by @BenasdTW in #632
- [XPU] Changed how XPU discovery works during `setup.py` by @Egor-Krivov in #720
- Fix to publish docs on pushes to main branch by @shimizust in #722
- Release 0.5.10 by @shimizust in #725
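For orientation on the new Sparsemax kernel (#687): below is a minimal PyTorch reference of the sparsemax function (Martins & Astudillo, 2016) that the Triton kernel fuses, not Liger's implementation itself.

```python
import torch

def sparsemax_reference(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Sparsemax (Martins & Astudillo, 2016): Euclidean projection of the
    logits onto the probability simplex, yielding sparse probabilities."""
    z, _ = torch.sort(x, dim=dim, descending=True)
    cumsum = z.cumsum(dim=dim)
    k = torch.arange(1, x.size(dim) + 1, device=x.device, dtype=x.dtype)
    shape = [1] * x.dim()
    shape[dim] = -1
    k = k.view(shape)                      # broadcast k along `dim`
    support = (1 + k * z) > cumsum         # elements inside the support
    k_z = support.sum(dim=dim, keepdim=True).to(x.dtype)
    tau = (cumsum.gather(dim, k_z.long() - 1) - 1) / k_z
    return torch.clamp(x - tau, min=0.0)

probs = sparsemax_reference(torch.tensor([2.0, 1.0, 0.1]))  # tensor([1., 0., 0.])
```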
New Contributors
- @KareemMusleh made their first contribution in #702
- @cyr0930 made their first contribution in #693
- @NanoCode012 made their first contribution in #712
- @mdy666 made their first contribution in #673
- @astefanutti made their first contribution in #674
- @Egor-Krivov made their first contribution in #720
Full Changelog: v0.5.9...v0.5.10
v0.5.9: Adds XPU Setup, GLM-4 & Qwen3 Model Support, Key Bugfixes
What's Changed
- update setup.py for installation on xpu by @faaany in #668
- update XPU CI yaml file to use docker container by @faaany in #669
- Add average_log_prob as an init param for LigerFusedLinearDPOLoss by @vaibhavjindal in #676 (usage sketch after this list)
- add shift label change by @shivam15s in #683
- remove tests that can pass on XPU by @faaany in #686
- Update mkdocs.yml by @shivam15s in #691
- Fix LigerCrossEntropy reduction='none' by @Tcc0403 in #680
- Support GLM-4 models by @intervitens in #685
- Import glm4_lce_forward locally in function by @vaibhavjindal in #695
- Qwen3 model support by @vaibhavjindal in #692
- Use logits_to_keep logic for training runs by @vaibhavjindal in #696
- increase gemma3 multimodal convergence test loss atol by @shivam15s in #697
- Update pyproject.toml by @shivam15s in #700
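On #676: a minimal construction sketch, assuming the chunked-loss import path from the project README; the exact constructor signature may differ across versions.

```python
# Sketch; import path per the project README, flag per PRs #676/#693.
from liger_kernel.chunked_loss import LigerFusedLinearDPOLoss

# average_log_prob chooses between averaging and summing per-token
# log-probs when forming sequence log-probs. Note v0.5.10 later flips
# the default to False (PR #693), so pin it explicitly.
dpo_loss = LigerFusedLinearDPOLoss(beta=0.1, average_log_prob=True)
```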
New Contributors
- @intervitens made their first contribution in #685
Full Changelog: v0.5.8...v0.5.9
v0.5.8: Backward-Compatible Fix
What's Changed
- backward compatible initialization by @shivam15s in #666
- Update pyproject.toml by @shivam15s in #667
Full Changelog: v0.5.7...v0.5.8
v0.5.7: Gemma3 Support, XPU Tuning Enhancements, GRPO Improvements, and API Compatibility Fixes
What's Changed
- Gemma3 (Text and Multimodal) by @eljandoubi in #621 (loading sketch after this list)
- Make FLCE compatible with latest `XXXForCausalLM.forward()` APIs by @Tcc0403 in #596
- do bias addition in tests in float32 to make testing code similar to torch compile by @shivam15s in #655
- [CI] fix siglip dummy config by @yundai424 in #658
- add XPU tuning to JSD by @rmukhopa in #649
- add XPU tuning to Rmsnorm and Layernorm by @Tarakarevu1 in #653
- Fix imports without transformers by @vaibhavjindal in #659
- Use TYPE_CHECKING to fix static-only imports in IDEs etc by @vaibhavjindal in #660
- [kl_div] Modified block and warp sizes for improved performance by @jgtong in #654
- [GRPO] add support for different loss types by @kashif in #662
- Remove unexpected kwargs passing to flce by @Tcc0403 in #651
- reduce number of tests for grpo by @shivam15s in #663
- Update pyproject.toml by @shivam15s in #665
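A hedged loading sketch for the Gemma3 text support (#621), using the AutoLigerKernelForCausalLM convenience wrapper documented in the README; the checkpoint name is illustrative, and whether Gemma3 is routed through the new patch depends on the installed version.

```python
# Sketch; wrapper name per the project README, checkpoint name illustrative.
from liger_kernel.transformers import AutoLigerKernelForCausalLM

# Applies the Liger kernels for supported architectures automatically.
model = AutoLigerKernelForCausalLM.from_pretrained("google/gemma-3-1b-it")
```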
New Contributors
- @rmukhopa made their first contribution in #649
- @Tarakarevu1 made their first contribution in #653
- @jgtong made their first contribution in #654
Full Changelog: v0.5.6...v0.5.7
v0.5.6: Enhancements, Fixes, and Expanded Support (Paligemma, DyT, XPU, Llava, GRPO, and More!)
What's Changed
- [JSD] JSD fixes by @kashif in #609
- Paligemma support by @eljandoubi in #608
- Fix hidden size by @eljandoubi in #612
- Add loss_utils for rewriting lce_forward methods by @Tcc0403 in #614
- Update Star History URL by @ryankert01 in #616
- Update README.md by @shivam15s in #617
- The language model of PaliGemma 1 is Gemma 1 by @eljandoubi in #613
- Update README to reflect recent changes by @helloworld1 in #619
- Support Dynamic Tanh (DyT) by @Tcc0403 in #618 (reference module after this list)
- Fix incorrect module name when monkey_patch applied to instantiated model by @vaibhavjindal in #629
- [chunked loss] align teacher and student logit shape by @yundai424 in #634
- Fix incorrect condition comment in log_target calculation by @p81sunshine in #633
- Add huggingface llava by @jp1924 in #524
- fix Llava test-bwd failure by @jp1924 in #639
- Fix GRPO to conform with TRL: Fix loss, make tests accurate, correct metrics computation by @shivam15s and @mRSun15 in #628
- add xpu tuning to CE by @mgrabban in #645
- add xpu tuning to FLJSD by @mgrabban in #647
- Change tests to use ROCm 6.3 and adjust tolerances to make Liger run on AMD by @shivam15s in #646
- Update pyproject.toml by @shivam15s in #648
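For #618: Dynamic Tanh (DyT) is a normalization-free replacement for LayerNorm, computing y = weight * tanh(alpha * x) + bias. A plain PyTorch reference module of what the fused LigerDyT kernel computes (not the kernel itself):

```python
import torch
import torch.nn as nn

class DyTReference(nn.Module):
    """Dynamic Tanh (DyT): y = weight * tanh(alpha * x) + bias,
    a normalization-free stand-in for LayerNorm."""
    def __init__(self, hidden_size: int, init_alpha: float = 0.5):
        super().__init__()
        self.alpha = nn.Parameter(torch.full((1,), init_alpha))
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.bias = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.weight * torch.tanh(self.alpha * x) + self.bias

out = DyTReference(4096)(torch.randn(2, 16, 4096))
```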
New Contributors
- @eljandoubi made their first contribution in #608
- @p81sunshine made their first contribution in #633
Full Changelog: v0.5.5...v0.5.6
v0.5.5: Chunk size fixes for JSD; KTO speed fixes; better metrics tests
What's Changed
- Infer correct device for AMD HIP device by @helloworld1 in #587
- add out of bounds check to cross entropy by @shivam15s in #588
- Monkeypatch for Qwen2.5-VL by @BenasdTW in #552
- KTO changes to return aux outputs by @vaibhavjindal in #589
- [KTO] Only return summed metrics by @vaibhavjindal in #591
- increase chunk size for distillation and add bias to jsd by @shivam15s in #590
- [CI] Add ROCm 6.3 CI by @tjtanaa in #506
- Fix KTO speed issue by @vaibhavjindal in #592
- Compare means of aggregated outputs in KTO tests by @vaibhavjindal in #595
- Fix means of logps and rewards by @vaibhavjindal in #597
- Add chunk_size param to chunked losses by @RichhLi in #599 (usage sketch after this list)
- Fix DPO/ORPO typo in readme by @tyler-romero in #602
- version bump by @shivam15s in #605
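On #599: a construction sketch, assuming the keyword is named chunk_size as the PR title suggests and the import path matches the README.

```python
# Sketch; keyword name taken from the PR title, import path from the README.
from liger_kernel.chunked_loss import LigerFusedLinearORPOLoss

# Smaller chunks lower peak memory (fewer logits rows materialized at once)
# at the cost of more kernel launches; larger chunks do the reverse.
orpo_loss = LigerFusedLinearORPOLoss(beta=0.1, chunk_size=1024)
```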
Full Changelog: v0.5.4...v0.5.5
v0.5.4: Granite 3.0 & 3.1, OLMo2, GRPO, TVD loss, and minor fixes
What's Changed
- add GitHub CI for Intel GPU by @faaany in #536
- Add Intel GPU CI to README.md by @hebiao064 in #562
- test split to 16, 32 by @jp1924 in #564
- Clean up workaround introduced in PR #564 by @austin362667 in #566
- Update README.md by @momochen in #567
- GRPO loss by @kashif in #553
- Update Readme with ROCM installation instruction by @zcnrex in #570
- fix failing qwen2vl and mllama tests by @shivam15s in #571
- KTO: Minor fix and documentation update by @vaibhavjindal in #574
- Add TVD Loss Kernel by @saurabhkoshatwar in #324 (reference formula after this list)
- Add KTO Benchmark Data into README by @hebiao064 in #575
- Support Granite 3.0 and 3.1 models by @JamesKunstle in #558
- Improve Hugging Face SFT Script by @ParagEkbote in #539
- Add unit tests for shared prefix masked attention with `torch.FlexAttention` by @austin362667 in #504
- update project readme to include Granite support by @JamesKunstle in #576
- Revert "Improve Hugging Face SFT Script (#539)" and Fix TVD Test for Intel #580 by @shivam15s in #578
- Fix Rope Test by @hebiao064 in #577
- Fix layer norm kernels by @lancerts in #582
- Add OLMO2 model support by @yundai424 in #581
- bump version to 0.5.4 by @yundai424 in #585
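For the TVD loss kernel (#324): the quantity it computes, shown as a plain PyTorch reference rather than the Triton kernel.

```python
import torch

def tvd_reference(p_logits: torch.Tensor, q_logits: torch.Tensor) -> torch.Tensor:
    """Total variation distance between two categorical distributions:
    TVD(P, Q) = 0.5 * sum_i |p_i - q_i|, averaged over the batch here."""
    p = torch.softmax(p_logits, dim=-1)
    q = torch.softmax(q_logits, dim=-1)
    return 0.5 * (p - q).abs().sum(dim=-1).mean()

loss = tvd_reference(torch.randn(4, 100), torch.randn(4, 100))
```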
New Contributors
- @jp1924 made their first contribution in #564
- @zcnrex made their first contribution in #570
- @vaibhavjindal made their first contribution in #574
- @saurabhkoshatwar made their first contribution in #324
- @JamesKunstle made their first contribution in #558
Full Changelog: v0.5.3...v0.5.4
v0.5.3: Minor fixes for post-training losses and support for KTO Loss
What's Changed
- Add ref_input parameter to support separate inputs for reference model by @xingyaoww in #467
- Revert "Add ref_input parameter to support separate inputs for reference model" by @ByronHsu in #469
- Add dynamic dependency management for CUDA and ROCm by @hebiao064 in #460
- [CI] runtime pip install using uv by @ByronHsu in #471
- modify ref_input in chunked_loss base class and fix tests by @shivam15s in #470
- Add more post training in readme by @ByronHsu in #472
- align post training loss at the center by @ByronHsu in #473
- [Transformer] fix ORPO loss for MOE models by @kashif in #479
- fix: correct typos in docstrings by @shivam15s in #482
- fix chosen_nll_loss in chunked losses by @kashif in #486
- Revert "fix chosen_nll_loss in chunked losses (#486)" by @shivam15s in #489
- fix dpo tests: reduce tolerance and change default compute_nll_loss false by @shivam15s in #490
- CPO & SimPO add label_smoothing by @Mecoli1219 in #493
- Fix Preference Loss and Refactor for Readability by @austin362667 in #484
- annotate tl constexpr values by @winglian in #497
- Fix Rope Compatibility with Cos/Sin Position Embedding for Batch Size > 1 by @wizyoung in #477
- Move the checkstyle to Ruff by @shivam15s in #483
- Fix/liger fused linear cross entropy function does not support reduction=none by @ryankert01 in #496
- Fix Dtype Mismatch in torch.addmm within ops/fused_linear_cross_entropy.py in AMP training. by @DandinPower in #502
- Add weight support for LigerCrossEntropy by @Tcc0403 in #420 (usage sketch after this list)
- Refactor Temperature Scaling in Distillation Loss by @austin362667 in #444
- Fix All `chunked_loss` Benchmark Scripts by @austin362667 in #438
- Set z_loss_1d=None when return_z_loss=False in cross_entropy_loss to avoid tl.store fail when triton_interpret=1 (for tl.device_print etc.) by @wa008 in #508
- Add `aux_outputs` for CPO and SimPO by @Mecoli1219 in #492
- Add `average_log_prob` arg for CPO by @Mecoli1219 in #510
- Refactor CrossEntropy and FusedLinearCrossEntropy by @Tcc0403 in #511
- [ORPO] add nll_target for orpo nll loss by @kashif in #503
- Format Benchmark Scripts with Ruff by @austin362667 in #516
- [Tiny] Add QVQ to readme by @tyler-romero in #522
- Add argument `return_z_loss` to flce by @Tcc0403 in #530
- Remove extra print by @apaz-cli in #531
- Fix HF `transformers` Breaking Changes by @austin362667 in #526
- Handle cache_position for transformers 4.47.0 and later (#528) by @BenasdTW in #529
- Create Docs for Liger-Kernel by @ParagEkbote in #485
- Add Mkdocs related dependencies to setup.py by @hebiao064 in #534
- Add KTO Loss by @hebiao064 in #475
- [tests] use a valid hexadecimal string instead of a placeholder by @faaany in #535
- [tests] skip failed tests for xpu by @faaany in #498
- Format files by @austin362667 in #541
- Fix Broken Links by @ParagEkbote in #547
- [Fix] Fix the type hint of `test_utils::concatenated_forward` by @hongpeng-guo in #549
- Add JSD Loss for Distillation by @austin362667 in #425
- [DPO] add reference log-prob outputs in DPO by @kashif in #521
- Fix DPO unit test fail and refactor by @Tcc0403 in #554
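On #420: a usage sketch, assuming LigerCrossEntropyLoss mirrors torch.nn.CrossEntropyLoss's `weight` semantics as the PR title suggests; the Triton kernels require a GPU.

```python
import torch
# Sketch; class name per the project README, `weight` semantics per PR #420.
from liger_kernel.transformers import LigerCrossEntropyLoss

num_classes = 8
ce = LigerCrossEntropyLoss(weight=torch.rand(num_classes, device="cuda"))
logits = torch.randn(4, num_classes, device="cuda", requires_grad=True)
target = torch.randint(0, num_classes, (4,), device="cuda")
loss = ce(logits, target)
loss.backward()
```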
New Contributors
- @xingyaoww made their first contribution in #467
- @kashif made their first contribution in #479
- @Mecoli1219 made their first contribution in #493
- @winglian made their first contribution in #497
- @DandinPower made their first contribution in #502
- @wa008 made their first contribution in #508
- @apaz-cli made their first contribution in #531
- @BenasdTW made their first contribution in #529
- @ParagEkbote made their first contribution in #485
Full Changelog: v0.5.2...v0.5.3
v0.5.2: Fix Qwen2VL mrope for transformers>=4.47
What's Changed
- Disable Qwen2 VL test for with logits conv test by @ByronHsu in #463
- Fix Qwen2VL mrope for transformers 4.47.0 by @li-plus in #464 (patch sketch after this list)
- Revert Workaround of Disabling QWEN2_VL in Convergence Tests by @austin362667 in #466
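A hedged sketch of picking up the fixed patch, assuming the function follows the project's apply_liger_kernel_to_* convention; apply the patch before constructing the model.

```python
# Sketch; patch name follows liger_kernel's monkey-patch convention.
from liger_kernel.transformers import apply_liger_kernel_to_qwen2_vl
from transformers import Qwen2VLForConditionalGeneration

apply_liger_kernel_to_qwen2_vl()  # patch first, so the fixed mrope is used
model = Qwen2VLForConditionalGeneration.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")
```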
Full Changelog: v0.5.1...v0.5.2
v0.5.1: Patch Fix for Import Error
Full Changelog: v0.5.0...v0.5.1