Releases: hao-ai-lab/FastVideo
Release 0.1.6
What's Changed
- [Chore] Include our demo in the readme. by @jzhang38 in #720
- [Feat] Add Wan2.2 14B MoE by @JerryZhou54 in #688
- [Misc] change installation logic of vsa by @jzhang38 in #721
- [3/3][Preprocess] add preprocessing workflows by @Eigensystem in #645
- [Fix] training pipeline pin_cpu_memory issue by @Eigensystem in #692
- add cicd workflow for publishing VSA kernel by @Gary-ChenJL in #723
- Remove all empty_cache by @Edenzzzz in #713
- update version selection for VSA workflow by @Gary-ChenJL in #725
- [bugfix] fix pyproject install and VSA precision test by @SolitaryThinker in #726
- Fix LoRA load from training checkpoint by @Edenzzzz in #719
- [bugfix] [distill] remove i2v validation schema import in distill by @SolitaryThinker in #728
- [feature] add Gradio live serving demo code by @SolitaryThinker in #727
- [bugfix] [dmd] Fix backward simulation and also naming in wan_i2v_dmd_pipeline by @SolitaryThinker in #731
- Fix vsa backward gQ by @jzhang38 in #735
- feat: preprocess validation dataset only when exist by @Eigensystem in #734
- Update WeChat group link by @jzhang38 in #739
- [Feat][Preprocessing] i2v preprocessing workflow by @Eigensystem in #737
- [Docker] add 12.9 docker image and also fix py3.10 and py3.11 dockerfile by @SolitaryThinker in #749
- [bugfix] Missing Docker file for cuda12.9 by @SolitaryThinker in #750
- [bugfix] [dmd] Align backward simulation with dmd2 sample back by @nappengman in #744
- [Fix] fix seed in dmd denoising loop by @jzhang38 in #736
- [bugfix] Check that model_index.json module is in required_modules list before removing by @SolitaryThinker in #756
- Optionally use unmerged weights for inference by @Edenzzzz in #745
- [Feat][Preprocess] support merged dataset by @Eigensystem in #752
- [Feat][Preprocess] support multi-gpus by @Eigensystem in #753
- [misc] [docs] Various fixes for logging and docs by @SolitaryThinker in #758
- [bugfix] Fix wrong HF model string for FastWan2.2 5B by @SolitaryThinker in #763
- Update Community Link by @jzhang38 in #765
- [Feat] Support Self-Forcing's Causal Inference for Wan2.1 T2V 1.3B by @JerryZhou54 in #766
- [Feature] Add wan2.2 5b i2v by @JerryZhou54 in #760
- [chore] Release 0.1.6 by @SolitaryThinker in #768
New Contributors
- @Gary-ChenJL made their first contribution in #723
- @nappengman made their first contribution in #744
Full Changelog: v0.1.5...v0.1.6
Release 0.1.5
What's Changed
- [CI] Add LoRA inference tests by @Edenzzzz in #546
- [bugfix] [training] use separate generator for validation by @SolitaryThinker in #610
- [chore] release 0.1.2 by @SolitaryThinker in #622
- video gen working on apple silicon (addressed issues from prior pr) by @RandNMR73 in #595
- [v0] Remove V0 code by @SolitaryThinker in #621
- [LoRA] Support v1 LoRA training by @Edenzzzz in #576
- Py/add triton block sparse by @jzhang38 in #593
- Fix lora train steps by @Edenzzzz in #627
- [bugfix] fa3 no longer returns lse by @SolitaryThinker in #631
- [CI] Fix CI for pull request targets other than main by @kevin314 in #632
- [bugfix] Fix preprocessing pipelines and nightly tests by @SolitaryThinker in #633
- [Bugfix] Fix LoRA trainable params and training ckpt loading by @Edenzzzz in #630
- [Docs] Docs update for Training and MPS by @SolitaryThinker in #641
- [Feature] Add DMD inference pipeline by @BrianChen1129 in #637
- [Feature] Multi-lora inference by @Edenzzzz in #640
- [Feature] Remove V1 folder by @SolitaryThinker in #642
- [1/3][Preprocess] refactor preprocessing configs by @Eigensystem in #638
- [misc] Use FASTVIDEO_STAGE_LOGGING for perf timing of stage by @SolitaryThinker in #644
- [Feature] Add prompt_txt support for CLI inference; Add DMD CLI inference by @BrianChen1129 in #646
- [core] Add offloading for vae and image encoder and rename offloading args by @SolitaryThinker in #643
- [CI] Add publish workflow for ComfyUI by @kevin314 in #647
- [CI] Fix ComfyUI publisher ID by @kevin314 in #648
- [bugfix] VideoGenerator improperly extracts output_video_name by @SolitaryThinker in #649
- [Feature] Add DMD T2V training pipeline by @BrianChen1129 in #651
- [Feature] Ignore [union-attr] and [override] mypy check and remove from training by @BrianChen1129 in #652
- [Feature] Add Wan-14B-T2V-VSA CLI inference; add master port args by @BrianChen1129 in #653
- [Feature][Distill]Add DMD+VSA joint training example by @BrianChen1129 in #654
- [2/3][Preprocess] refactor pipeline registry & file structure by @Eigensystem in #639
- [Feature][Distill]Add 14B 480p T2V distill example scripts by @BrianChen1129 in #655
- [Feat] Support VSA with any resolution. by @jzhang38 in #650
- [Bugfix]Fix DMD wan pipeline by @BrianChen1129 in #659
- [Bugfix] Fix model inference checkpoint saving when enabling HSDP by @BrianChen1129 in #660
- [Feature] Add DMD CI test by @BrianChen1129 in #661
- [Feature]Add DMD distillation training resume checkpoint; Update DMD CI test by @BrianChen1129 in #662
- [Feature] Add wan2.2 5B T2V by @SolitaryThinker in #658
- [ComfyUI] Add init.py for node discovery by @kevin314 in #663
- [BUG] Fix distillation + vsa by @jzhang38 in #665
- [Feature]Add VSA slurm training example scripts by @BrianChen1129 in #666
- [chore] Release 0.1.4 by @kevin314 in #667
- [Bugfix][Training]Fix Wan2.2 training vae config issue by @BrianChen1129 in #668
- [Feature] [Inference]Add ROCm platform support for single-gpu inference by @sopiko99 in #669
- [Bugfix]Fix DMD pipeline registry by @BrianChen1129 in #670
- Modify args to make sure the scripts are runnable on 4090 by @JerryZhou54 in #671
- [Misc] Update examples/ and other misc by @jzhang38 in #672
- [Feature]Add DMD visualization for debugging by @BrianChen1129 in #674
- fix _normalize_dit_input by @MartinPernus in #681
- [Bugfix] Fix multi-gpu training lr_scheduler by @BrianChen1129 in #682
- [Misc] Fix training scripts by @Edenzzzz in #683
- [Bugfix] Add i2v vae loading by @BrianChen1129 in #686
- [Feature] Optionally enable torch compile by @Edenzzzz in #684
- [Feature][Readme] Add VSA/DMD doc by @BrianChen1129 in #673
- [Feature] Add Wan2.2-TI2V-5B Sparse Distill by @BrianChen1129 in #690
- [config] Add config for FastWan2.2 ti2v 5B by @SolitaryThinker in #693
- [Feature] Add Wan2.2 DMD example files; Update lr scheduler by @BrianChen1129 in #694
- [Feature] Remove unused args by @BrianChen1129 in #695
- [Docs] Update README and docs for FastWan by @SolitaryThinker in #698
- [Feature] Update sparse distill readme and doc by @BrianChen1129 in #700
- [misc] Readme fixes by @SolitaryThinker in #699
- [Feature] Update readme by @BrianChen1129 in #702
- [Docs] Fix README by @SolitaryThinker in #701
- Update readme pre-release by @zhisbug in #704
- [Feature] Update cites by @BrianChen1129 in #703
- [Feature]Update Wan2.2+DMD doc example by @BrianChen1129 in #706
- [misc] Remove allow_tf32 in scripts by @Edenzzzz in #705
- Add WeChat group link by @jzhang38 in #707
- [Bugfix] Fix neg_prompt bug when training from local cp by @BrianChen1129 in #708
- Fix typo by @BrianChen1129 in #709
- [Feature]Add Data-free distillation readme by @BrianChen1129 in #710
- [chore] Release 0.1.5 by @SolitaryThinker in #717
New Contributors
- @RandNMR73 made their first contribution in #595
- @sopiko99 made their first contribution in #669
- @MartinPernus made their first contribution in #681
- @zhisbug made their first contribution in #704
Full Changelog: v0.1.2...v0.1.5
v0.1.2
Last release before removal of v0 code.
What's Changed
- Fix VAE precisions by @Edenzzzz in #588
- [LoRA] Fix lora merge weights by @Edenzzzz in #579
- Add ComfyUI custom node for inference by @kevin314 in #596
- [Feature] Offload all text encoders by default by @Edenzzzz in #594
- [Training] Use inference pipeline for training validation by @SolitaryThinker in #585
- [chore] Upgrade min Python version from 3.8 to 3.10 by @SolitaryThinker in #597
- [bugfix] [training] fix deadlock in latent datasets and init error in multi-node training by @SolitaryThinker in #598
- [docs] Update slack invite by @SolitaryThinker in #601
- [docs] update dev guide runpod image to py3.12 by @SolitaryThinker in #602
- Remove all unnecessary torch.cuda.empty_cache by @Edenzzzz in #606
- Set encoder TP size to 1 by default by @Edenzzzz in #569
- [Feature][Training]Update example fine-tuning scripts to enable gradient checkpointing by @BrianChen1129 in #618
Full Changelog: v0.1.1...v0.1.2
v0.1.1
What's Changed
- [Docs] Add CLI docs by @SolitaryThinker in #406
- [Docs] Fix image by @SolitaryThinker in #407
- [Teacache] allow None for forward_context batch when using teacache by @SolitaryThinker in #412
- [V1] Remove vLLM dependency by @SolitaryThinker in #413
- Fulfill worker response on interrupt by @kevin314 in #417
- [bug] fix bs > 1 by @SolitaryThinker in #418
- Fix version number by @Edenzzzz in #422
- [Tests] don't run 3.10 and 3.11 for SSIM by @SolitaryThinker in #427
- Use version.py by @Edenzzzz in #424
- Unify env report script in issue template by @Edenzzzz in #423
- Set device for encode by @kevin314 in #420
- [Misc] Small fixes to Torch code by @applesaucethebun in #395
- misc: Trigger transformers CI for layers and attention code change by @Edenzzzz in #434
- [Training] [2/n] add bwd for all2all and all_gather by @SolitaryThinker in #439
- [Training] [3/n] Add training args and dependencies by @SolitaryThinker in #440
- [Training] [4/n] add training save checkpoint by @SolitaryThinker in #441
- [Training] [1/n] Add latent datasets by @SolitaryThinker in #438
- Update STA mask strategy downloading by @BrianChen1129 in #445
- [Training] [5/n] Add single gpu training pipeline by @SolitaryThinker in #447
- [Training] [0/n] Add preprocessing pipeline by @JerryZhou54 in #442
- [Training] [6/n]Mixed precision training by @SolitaryThinker in #448
- [Training] [7/n] gradient clipping by @SolitaryThinker in #449
- [Training] [8/n] SP Training by @SolitaryThinker in #450
- misc: add remote pdb for debugging workers by @Edenzzzz in #456
- [Misc] Remove InferenceEngine by @Edenzzzz in #455
- [Misc] disable cast_forward_inputs by @SolitaryThinker in #460
- Bring back mask files under asset/ and update new Wan mask strategy file by @BrianChen1129 in #462
- Fix WanVideo by @JerryZhou54 in #461
- [Training] Add distributed checkpointing by @kevin314 in #458
- Update v1 inference scripts by @JerryZhou54 in #467
- [Training] Support Multi-Node training with FSDP + SP by @SolitaryThinker in #459
- [misc] Polish V1 training code by @Edenzzzz in #469
- [misc] Find unused port in distributed init by @Edenzzzz in #475
- [LoRA] Support V1 LoRA inference by @Edenzzzz in #451
- [bugfix] fix bz >1 for training by @SolitaryThinker in #477
- [Issue template] Move env report to the end for readability by @Edenzzzz in #476
- [Preprocess] I2V dataset by @BrianChen1129 in #473
- [Distill] support distill for wan by @AliceChenyy in #444
- [STA] Implement mask search and update mask strategy for V1's Wan2.1 by @KevinZeng08 in #415
- [bugfix] [training] Add negative prompt to preprocessing and validation by @jzhang38 in #479
- [bugfix] [misc] fix denoising stage init; rename distributed env function; fix logging. by @jzhang38 in #481
- Add torch.compile for all small ops by @Edenzzzz in #432
- Revert "Add torch.compile for all small ops" by @Edenzzzz in #484
- [Bug] Fix multi gpus issues in v1 scripts by @BrianChen1129 in #489
- [misc] Improve distributed related env variables and setup by @jzhang38 in #487
- [bugfix][Cli Inference] Resolve runtime errors when running fastvideo generate by @JerryZhou54 in #493
- Fix pre-commit CI by @Edenzzzz in #494
- [bugfix][Cli Inference] Resolve runtime errors when running fastvideo generate by @JerryZhou54 in #495
- [Feature] Adding VSA inference by @BrianChen1129 in #478
- [misc] Add missing license headers by @SolitaryThinker in #499
- [Feat][Dataloader] 1/n Refactor parquet map-style dataloader by @jzhang38 in #492
- [Feature][VSA]Update STA publish workflow by @BrianChen1129 in #498
- [misc] rename dp_size to hdsp_replicate_dim by @jzhang38 in #491
- [CI] [Training] Initial e2e small training test by @SolitaryThinker in #504
- [feat] Add parquet iterable dataset. by @jzhang38 in #506
- [Refactor][Configurations] clean config organization by @Eigensystem in #505
- fix logging by @jzhang38 in #509
- [CI] Restrict training CI to v1 by @Edenzzzz in #508
- [misc] Fix preprocessing and dataloader extra padding by @jzhang38 in #514
- [Feature][Training]vsa for t2v training ready by @BrianChen1129 in #513
- [Feature][Preprocess]Add Readme doc for preprocess by @BrianChen1129 in #518
- [CI] [Training] drop negative prompt in validation dataset and CI test for preprocess + training overfit by @SolitaryThinker in #519
- [Bugfix][Preprocess]fix mini dataset name by @BrianChen1129 in #520
- [Refactor] Fix attn backend selection not correctly setting env variable by @Edenzzzz in #516
- [misc] [ci] fix e2e preprocess+training data path by @SolitaryThinker in #521
- [bugfix] [Training] use diffusers fp32layernorm for wan2.1 by @SolitaryThinker in #490
- [CI] Update Docker image to flash-attn 2.8.0 / CUDA 12.8 by @kevin314 in #524
- [CI] Add current PR test workflow to Buildkite/Modal by @kevin314 in #512
- [CI][bugfix] Use new 3.12 docker image by @SolitaryThinker in #526
- [Bugfix][Inference]Fix envs.attn_backend by @BrianChen1129 in #525
- [Ci] add sta and vsa install to docker image by @SolitaryThinker in #528
- [Feature][CI]Add STA-inference/VSA-training test by @BrianChen1129 in #527
- [Bugfix][Readme]Fix readme website bugs and add VSA finetune docs by @BrianChen1129 in #531
- [Refactor] Move dict_to_3d_list under utils by @Edenzzzz in #507
- Specify cu128 Pytorch installation by @kevin314 in #530
- [Feat] Add Stage input and output verification by @SolitaryThinker in #523
- [misc] Remove gradient checking code by @SolitaryThinker in #532
- [bugfix] Fix stage validator for multi text encoder models by @SolitaryThinker in #535
- [bugfix] [VSA] Fix layernorm type for VSA Wan2.1 TransformerBlock by @SolitaryThinker in #534
- [misc] [training] Reorganize training pipeline by @SolitaryThinker in #533
- [chore] Bump torch to 2.7.1 to support Blackwell by @Edenzzzz in #483
- [Training] Refactor and improve validation datasets by @SolitaryThinker in #539
- [Feature][Training]Add diffusers format checkpoint saving for inference by @BrianChen1129 in #542
- [Kernel] Remove all syncs from STA & VSA kernels by @Edenzzzz in #517
- [CI] Fix CI checks by @Edenzzzz in #553
- [Feature][Training] Add cfg rate for dataset loader by @BrianChen1129 in ht...
v0.1.0
What's Changed
- [V1] Update README by @SolitaryThinker in #400
- [CLI] Default to pipeline config by @kevin314 in #401
- [V1] Docs Update by @SolitaryThinker in #402
- [V1] Update where num_frame rounding is done by @SolitaryThinker in #403
- Release 0.1.0 by @SolitaryThinker in #405
Full Changelog: v0.0.5...v0.1.0
v0.0.5
What's Changed
- Sync main with yongqi-dev2 by @BrianChen1129 in #70
- [cleanup] by @jzhang38 in #72
- add web demo by @BrianChen1129 in #73
- Cleanup by @jzhang38 in #75
- Cleanup by @jzhang38 in #77
- Hunyuanvideo by @jzhang38 in #78
- add hunyuan adv by @jzhang38 in #79
- update release readme by @foreverpiano in #81
- Cleanup README. by @jzhang38 in #83
- Clean up by @jzhang38 in #84
- Rlsu lora readme by @jzhang38 in #85
- Rlsu lora readme by @jzhang38 in #86
- Add Replicate demo and API by @lucataco in #93
- fix lora checkpoint saving issue by @BrianChen1129 in #97
- [feat]:Single 4090 inference for fasthunyuan by @jzhang38 in #104
- [Minor] Adding issue template. by @foreverpiano in #114
- [feat]: Add format auto fixer to main branch by @rlsu9 in #124
- [Fix] Save CK, Dataset bug fix by @jzhang38 in #125
- [feat]: Add tests for FastVideo by @rlsu9 in #127
- add parallel for vae decoding by @rucnyz in #134
- Update README.md by @foreverpiano in #131
- adding hunyuan hf (support lora finetuning); unified hunyuan hf inference with quantization by @BrianChen1129 in #135
- Lora README update by @BrianChen1129 in #155
- Create config.yml by @foreverpiano in #152
- add sliding tile attn by @jzhang38 in #182
- Add STA and teacache forward by @BrianChen1129 in #184
- fix kernel issue by @BrianChen1129 in #185
- Infer sta tea with torch.compile by @BrianChen1129 in #190
- [feat]: fix readme demo and add video to readme by @rlsu9 in #191
- update env by @jzhang38 in #194
- Update Cite by @jzhang38 in #195
- Update typo by @jzhang38 in #198
- fix ori hunyuan inference issue by @BrianChen1129 in #199
- Add StepVideo by @jzhang38 in #200
- [feat] fix isort format by @rlsu9 in #203
- [FIX] Make STA optional by @jzhang38 in #204
- Update readme by @BrianChen1129 in #202
- Update STA README.md by @jzhang38 in #206
- Added multi-GPU support for Hunyuan STA by @BrianChen1129 in #211
- fix train/distill issue by @BrianChen1129 in #215
- update cfg bug? by @jzhang38 in #223
- fix training mask strategy issue by @BrianChen1129 in #248
- Establish cicd workflow to build and publish FastVideo and STA Kernel by @PorridgeSwim in #227
- v1 by @jzhang38 in #270
- Set up text encoder tests to work with pytest and Github Actions by @kevin314 in #302
- [CI] Add test workflow improvements by @kevin314 in #311
- refactor the env setup and install of fastvideo by @PorridgeSwim in #309
- Add ssim test by @kevin314 in #314
- Fix sdpa by @jzhang38 in #315
- Add torch sdpa backend to ssim test by @SolitaryThinker in #316
- [CI] Add manual triggers for PR workflow by @kevin314 in #320
- [Docs] Initial Docs Build by @SolitaryThinker in #322
- [CI] Use pre-commit to run linter by @SolitaryThinker in #321
- [Docs] Fix doc lint by @SolitaryThinker in #325
- [CI] Set allowedCudaVersions by @kevin314 in #329
- [Docs] Add dev guide and doc building CI by @SolitaryThinker in #330
- Port tests to v1 by @kevin314 in #333
- V1 wan rebased by @SolitaryThinker in #335
- [Model] Remove RMSNorm's forward_native hardcode from Wan by @SolitaryThinker in #339
- [Docs] Initial examples setup and more docs by @SolitaryThinker in #332
- [CI] Support custom Docker image by @kevin314 in #342
- Add STA to V1 by @jzhang38 in #312
- [STA] Sta release 0.0.3 by @SolitaryThinker in #344
- [CI] Free up runner disk for sta-publish by @kevin314 in #345
- [CI] Add manual trigger to sta-publish and fastvideo-publish by @kevin314 in #346
- Pipeline config by @JerryZhou54 in #343
- [CI] Docker image improvements by @kevin314 in #350
- Default to using original WanVAE's encoding/decoding algorithm by @JerryZhou54 in #351
- [CLI] Fix duplicate --num-gpus by @kevin314 in #352
- add STA to Wan v1 by @BrianChen1129 in #349
- [Docs] Fix developer guide images by @kevin314 in #353
- [1/n] [v1] Add Worker abstractions for User API by @SolitaryThinker in #336
- [sta] release 0.0.4 by @SolitaryThinker in #354
- [V1] Worker cleanup; Logging clean up; enables isort again by @SolitaryThinker in #355
- [V1] Process aware logging; improve logging msg by @SolitaryThinker in #356
- [V1] Gradio demo with new API by @kevin314 in #357
- chore: Release FastVideo 0.0.2 and update python requirements by @SolitaryThinker in #360
- [V1] Worker improvements/cleanup by @kevin314 in #361
- [Docs] Docs for design and adding new pipeline by @SolitaryThinker in #363
- [Attn] Add SageAttention Backend by @SolitaryThinker in #366
- Model config by @JerryZhou54 in #358
- Update SSIM tests to use new API by @kevin314 in #369
- Refactor encoder by @JerryZhou54 in #370
- Fix model config for python 3.11+ by @SolitaryThinker in #373
- release 0.0.3 by @SolitaryThinker in #374
- change gradio example to use model configs by @SolitaryThinker in #375
- Fix FSDP issues when using cpu_offload flag by @JerryZhou54 in #376
- [CI] Add new images for different Python versions by @kevin314 in #377
- [CI] Add write permissions to build-image workflow by @kevin314 in #379
- [Lint] fix by @SolitaryThinker in #382
- Add Teacache to V1 by @SolitaryThinker in #371
- Cleanup Teacache params by @SolitaryThinker in #386
- Small Fixes & Features by @JerryZhou54 in #378
- [Docs] Update for V1 by @SolitaryThinker in #381
- [misc] Improve worker cleanup by @SolitaryThinker in #387
- Release 0.0.4 by @SolitaryThinker in #388
- [Misc] Small Fixes & Features by @JerryZhou54 in #390
- [Docs] Add collect_env.py and various docs update by @SolitaryThinker in #393
- [CI] Use python 3.10/3.11 for SSIM test by @kevin314 in #392
- [CLI] Update cli to support new api/mo...