34 | 34 | - **Testing & Documentation**: UniDiffusers test (#1007), 'reuse a pipeline' docs (#989, sketched below), diffusers mint changes (#992)
35 | 35 |
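The 'reuse a pipeline' docs (#989) cover sharing already-loaded components across pipelines. A minimal sketch, assuming mindone.diffusers mirrors the Hugging Face diffusers `from_pipe` API; the checkpoint ID is illustrative:

```python
# Sketch of the 'reuse a pipeline' pattern, assuming mindone.diffusers
# mirrors diffusers' `from_pipe` API. The repo ID is illustrative.
from mindone.diffusers import (
    StableDiffusionImg2ImgPipeline,
    StableDiffusionPipeline,
)

text2img = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5"
)
# from_pipe shares the already-loaded components, so the second pipeline
# adds no extra weight loading or memory.
img2img = StableDiffusionImg2ImgPipeline.from_pipe(text2img)
```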
36 | 36 | #### model components |
37 | | -- **Video Transformers**: transformer_qwenimage, transformer_hidream_image, transformer_wan_vace, transformer_skyreels_v2, transformer_chroma, transformer_cosmos, transformer_hunyuan_video_framepack, consisid_transformer_3d |
38 | | -- **Autoencoders**: autoencoder_kl_qwenimage, autoencoder_kl_cosmos |
| 37 | +- **Video Transformers**: transformer_qwenimage (#1288), transformer_hidream_image, transformer_wan_vace (#1148), transformer_skyreels_v2, transformer_chroma, transformer_cosmos, transformer_hunyuan_video_framepack, consisid_transformer_3d |
| 38 | +- **Autoencoders**: autoencoder_kl_qwenimage (#1288), autoencoder_kl_cosmos |
39 | 39 | - **ControlNets**: controlnet_sana, multicontrolnet_union |
40 | 40 | - **Processing Modules**: cache_utils, auto_model, lora processing modules |
41 | | -- **Integration Components**: Model components for QwenImage, HiDream, Wan-VACE, SkyReels-V2, Chroma, Cosmos, HunyuanVideo, Sana, and other pipelines |
| 41 | +- **Integration Components**: Model components added as part of pipeline implementations (loading sketch below)
42 | 42 |
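These components can also be loaded individually. A minimal sketch, assuming mindone.diffusers mirrors diffusers' `AutoModel` entry point via the auto_model module listed above; the repo ID, subfolder, and dtype keyword are illustrative:

```python
# Sketch: loading a newly added component through AutoModel, assuming
# mindone.diffusers mirrors diffusers' AutoModel API.
import mindspore as ms
from mindone.diffusers import AutoModel

# Hypothetical hub repo/subfolder for the transformer_qwenimage component.
transformer = AutoModel.from_pretrained(
    "Qwen/Qwen-Image",
    subfolder="transformer",
    mindspore_dtype=ms.bfloat16,
)
```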
43 | 43 | ### mindone.peft |
44 | 44 | - Added mindone.peft and upgraded to v0.15.2 (#1194); see the LoRA sketch after this list
45 | 45 | - Added Qwen2.5-Omni LoRA finetuning script with transformers 4.53.0 (#1218) |
46 | 46 | - Fixed handling of lora and lora_scale for each PEFT layer (#1187)
47 | 47 |
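A minimal sketch of the LoRA workflow these changes support, assuming mindone.peft mirrors the Hugging Face peft v0.15.2 custom-model API on MindSpore Cells; the tiny model and target-module names are placeholders:

```python
# Sketch: attaching a LoRA adapter with mindone.peft. TinyBlock and the
# target module names are illustrative, not from the release notes.
import mindspore.nn as nn
from mindone.peft import LoraConfig, get_peft_model

class TinyBlock(nn.Cell):
    def __init__(self):
        super().__init__()
        self.q_proj = nn.Dense(32, 32)
        self.v_proj = nn.Dense(32, 32)

    def construct(self, x):
        return self.q_proj(x) + self.v_proj(x)

config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(TinyBlock(), config)
model.print_trainable_parameters()  # only the LoRA matrices are trainable
```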
48 | 48 | ### models under examples (mostly with finetuning/training scripts)
49 | | -- Added Janus model for unified understanding and generation |
50 | | -- Added Emu3 model for multimodal tasks |
| 49 | +- Added Janus model for unified understanding and generation (#1378) |
| 50 | +- Added Emu3 model for multimodal tasks (#1233) |
51 | 51 | - Added VAR model for class-conditional image generation |
52 | 52 | - Added HunyuanVideo and HunyuanVideo-I2V models |
53 | | -- Added Wan2.1 and Wan2.2 models for text/image-to-video generation |
54 | | -- Added OpenSora models (PKU and HPC-AI versions) |
55 | | -- Added MovieGen 30B model |
| 53 | +- Added Wan2.1 model for text/image-to-video generation (#1363) |
| 54 | +- Added Wan2.2 model for text/image-to-video generation (#1243) |
| 55 | +- Added OpenSora models (PKU and HPC-AI versions) (#687) |
| 56 | +- Added MovieGen 30B model (#1362) |
56 | 57 | - Added Step-Video-T2V model |
57 | 58 | - Added CogView4 model for text-to-image generation |
58 | | -- Added OmniGen and OmniGen2 models |
59 | | -- Added CannyEdit for image editing tasks |
| 59 | +- Added OmniGen and OmniGen2 models (#1227) |
| 60 | +- Added CannyEdit for image editing tasks (#1346) |
60 | 61 | - Added SparkTTS for text-to-speech synthesis |
61 | | -- Added SAM2 for image segmentation |
62 | | -- Added LangSAM for language-guided segmentation |
| 62 | +- Added SAM2 for image segmentation (#1200) |
| 63 | +- Added LangSAM for language-guided segmentation (#1369) |
63 | 64 | - Added MMaDA for multimodal generation |
64 | 65 |
65 | 66 | ### Changed |