A curated list of papers on controllable video generation. Please join us in building a more comprehensive summary. If you have any additions to the list, please raise them in the issue section. Additions are welcome!
- Structure Control
- ID Control
- Image Control
- Temporal Control
- Audio Control
- Other Controls
- Universal Control
- Paper searching via catalogue: click an entry in the catalogue to select your research area and browse related papers.
- Paper searching via author name: feel free to search for papers by a specific author by pressing Ctrl + F and typing the author name. The dropdown list of authors will automatically expand when searching.
- Multi-control indicator: a tag following a paper's title indicates that the paper is "multi-control".
-
MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation
(30 May 2025)
Yanbo Ding, Xirui Hu, Zhizhi Guo, et al.
Yanbo Ding, Xirui Hu, Zhizhi Guo, Chi Zhang, Yali Wang -
HyperMotion: DiT-Based Pose-Guided Human Image Animation of Complex Motions
(29 May 2025)
Shuolin Xu, Siming Zheng, Ziyi Wang, et al.
Shuolin Xu, Siming Zheng, Ziyi Wang, HC Yu, Jinwei Chen, Huaqi Zhang, Bo Li, Peng-Tao Jiang -
FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios
(6 May 2025)
[SIGGRAPH 2025] Shiyi Zhang*, Junhao Zhuang*, et al.
Shiyi Zhang*, Junhao Zhuang*, Zhaoyang Zhang, Ying Shan, Yansong Tang -
DualReal: Joint Training for Lossless Identity-Motion Fusion in Video Customization
(4 May 2025)
Wenchuan Wang, Mengqi Huang, Yijing Tu, et al.
Wenchuan Wang, Mengqi Huang, Yijing Tu, Zhendong Mao -
DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation
(23 May 2025)
Junhao Chen, Mingjin Chen, Jianjin Xu, et al.
Junhao Chen, Mingjin Chen, Jianjin Xu, Xiang Li, Junting Dong, Mingze Sun, Puhua Jiang, Hongxiang Li, Yuhang Yang, Hao Zhao, Xiaoxiao Long, Ruqi Huang -
TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation
(14 Mar 2025)
Hongxiang Zhao, Xingchen Liu, et al.
Hongxiang Zhao, Xingchen Liu, Mutian Xu, Yiming Hao, Weikai Chen, Xiaoguang Han -
Point-to-Point Video Generation (14 Mar 2025)
Tsun-Hsuan Wang, Yen-Chi Cheng, et al.
Tsun-Hsuan Wang, Yen-Chi Cheng, Chieh Hubert Lin, Hwann-Tzong Chen, Min Sun -
Pose Guided Human Video Generation (14 Mar 2025)
Ceyuan Yang, Zhe Wang, et al.
Ceyuan Yang, Zhe Wang, Xinge Zhu, Chen Huang, Jianping Shi, Dahua Lin -
SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers
(15 Feb 2025)
Di Qiu, Zhengcong Fei, et al.
Di Qiu, Zhengcong Fei, Rui Wang, Jialin Bai, Changqian Yu, Mingyuan Fan, Guibin Chen, Xiang Wen -
AnyCharV: Bootstrap Controllable Character Video Generation with Fine-to-Coarse Guidance
(12 Feb 2025)
Zhao Wang, Hao Wen, et al.
Zhao Wang, Hao Wen, Lingting Zhu, Chenming Shang, Yujiu Yang, Qi Dou -
Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance
(10 Feb 2025)
Li Hu, Guangyuan Wang, et al.
Li Hu, Guangyuan Wang, Zhen Shen, Xin Gao, Dechao Meng, Lian Zhuo, Peng Zhang, Bang Zhang, Liefeng Bo -
DirectorLLM for Human-Centric Video Generation (19 Dec 2024)
Kunpeng Song, Tingbo Hou, et al.
Kunpeng Song, Tingbo Hou, Zecheng He, Haoyu Ma, Jialiang Wang, Animesh Sinha, Sam Tsai, Yaqiao Luo, Xiaoliang Dai, Li Chen, Xide Xia, Peizhao Zhang, Peter Vajda, Ahmed Elgammal, Felix Juefei-Xu -
Generative Inbetweening through Frame-wise Conditions-Driven Video Generation
(16 Dec 2024)
[CVPR 2025] Tianyi Zhu, Dongwei Ren, et al.
Tianyi Zhu, Dongwei Ren, Qilong Wang, Xiaohe Wu, Wangmeng Zuo -
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation
(12 Dec 2024)
[ICLR 2025] Hongxiang Li, Yaowei Li, et al.
Hongxiang Li, Yaowei Li, Yuhang Yang, Junjie Cao, Zhihong Zhu, Xuxin Cheng, Long Chen -
StableAnimator: High-Quality Identity-Preserving Human Image Animation
(26 Nov 2024)
[CVPR 2025] Shuyuan Tu, Zhen Xing, Xintong Han, et al.
Shuyuan Tu, Zhen Xing, Xintong Han, Zhi-Qi Cheng, Qi Dai, Chong Luo, Zuxuan Wu -
AnimateAnywhere: Context-Controllable Human Video Generation with ID-Consistent One-shot Learning
(28 Oct 2024)
[ACMMM 2024] Hengyuan Liu, Xiaodong Chen, Xinchen Liu, et al.
Hengyuan Liu, Xiaodong Chen, Xinchen Liu, Xiaoyan Gu, Wu Liu -
ControlNeXt: Powerful and Efficient Control for Image and Video Generation
(12 Aug 2024)
Bohao Peng, Jian Wang, et al.
Bohao Peng, Jian Wang, Yuechen Zhang, Wenbo Li, Ming-Chang Yang, Jiaya Jia -
TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models
(12 Jul 2024)
Jeongho Kim, Min-Jung Kim, Junsoo Lee, et al.
Jeongho Kim, Min-Jung Kim, Junsoo Lee, Jaegul Choo -
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
(28 Jun 2024)
Yuang Zhang, Jiaxi Gu, Li-Wen Wang, et al.
Yuang Zhang, Jiaxi Gu, Li-Wen Wang, Han Wang, Junqi Cheng, Yuefeng Zhu, Fangyuan Zou -
Follow-Your-Pose v2: Multiple-Condition Guided Character Image Animation for Stable Pose Control
(05 Jun 2024)
Jingyun Xue, Hongfa Wang, Qi Tian, et al.
Jingyun Xue, Hongfa Wang, Qi Tian, Yue Ma, Andong Wang, Zhiyuan Zhao, Shaobo Min, Wenzhe Zhao, Kaihao Zhang, Heung-Yeung Shum, Wei Liu, Mengyang Liu, Wenhan Luo -
UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation
(3 Jun 2024)
Xiang Wang, Shiwei Zhang, Changxin Gao, et al.
Xiang Wang, Shiwei Zhang, Changxin Gao, Jiayu Wang, Xiaoqiang Zhou, Yingya Zhang, Luxin Yan, Nong Sang -
VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation
(28 May 2024)
Qilin Wang, Zhengkai Jiang, Chengming Xu, et al.
Qilin Wang, Zhengkai Jiang, Chengming Xu, Jiangning Zhang, Yabiao Wang, Xinyi Zhang, Yun Cao, Weijian Cao, Chengjie Wang, Yanwei Fu -
Disentangling Foreground and Background Motion for Enhanced Realism in Human Video Generation
(26 May 2024)
Jinlin Liu, Kai Yu, Mengyang Feng, et al.
Jinlin Liu, Kai Yu, Mengyang Feng, Xiefan Guo, Miaomiao Cui -
PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control
(23 May 2024)
[ECCV 2024] Yong Zhong, Min Zhao, Zebin You, et al.
Yong Zhong, Min Zhao, Zebin You, Xiaofeng Yu, Changwang Zhang, Chongxuan Li -
Zero-shot High-fidelity and Pose-controllable Character Animation
(21 Apr 2024)
[IJCAI 2024] Bingwen Zhu, Fanyi Wang, Tianyi Lu, et al.
Bingwen Zhu, Fanyi Wang, Tianyi Lu, Peng Liu, Jingwen Su, Jinxiu Liu, Yanhao Zhang, Zuxuan Wu, Guo-Jun Qi, Yu-Gang Jiang -
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
(21 Mar 2024)
[ECCV 2024] Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, et al.
Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, Qingkun Su, Yinghui Xu, Xun Cao, Yao Yao, Hao Zhu, Siyu Zhu -
Do You Guys Want to Dance: Zero-Shot Compositional Human Dance Generation with Multiple Persons
(24 Jan 2024)
Zhe Xu, Kun Wei, et al.
Zhe Xu, Kun Wei, Xu Yang, Cheng Deng -
DreaMoving: A Human Video Generation Framework based on Diffusion Models
(8 Dec 2023)
Mengyang Feng, Jinlin Liu, Kai Yu, et al.
Mengyang Feng, Jinlin Liu, Kai Yu, Yuan Yao, Zheng Hui, Xiefan Guo, Xianhui Lin, Haolan Xue, Chen Shi, Xiaowen Li, Aojie Li, Xiaoyang Kang, Biwen Lei, Miaomiao Cui, Peiran Ren, Xuansong Xie -
Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models
(3 Dec 2023)
[CVPR 2024] Shengqu Cai, Duygu Ceylan, et al.
Shengqu Cai, Duygu Ceylan, Matheus Gadelha, Chun-Hao Paul Huang, Tuanfeng Yang Wang, Gordon Wetzstein -
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
(28 Nov 2023)
[CVPR 2024] Li Hu, Xin Gao, Peng Zhang, et al.
Li Hu, Xin Gao, Peng Zhang, Ke Sun, Bang Zhang, Liefeng Bo -
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
(27 Nov 2023)
[CVPR 2024] Zhongcong Xu, Jianfeng Zhang, et al.
Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Hanshu Yan, Jia-Wei Liu, Chenxu Zhang, Jiashi Feng, Mike Zheng Shou -
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
(18 Nov 2023)
[ICML 2024] Di Chang, Yichun Shi, Quankai Gao, et al.
Di Chang, Yichun Shi, Quankai Gao, Jessica Fu, Hongyi Xu, Guoxian Song, Qing Yan, Yizhe Zhu, Xiao Yang, Mohammad Soleymani -
Dancing Avatar: Pose and Text-Guided Human Motion Videos Synthesis with Image Diffusion Model (15 Aug 2023)
Bosheng Qin, Wentao Ye, et al.
Bosheng Qin, Wentao Ye, Qifan Yu, Siliang Tang, Yueting Zhuang -
DISCO: Disentangled Control for Realistic Human Dance Generation
(30 Jun 2023)
[CVPR 2024] Tan Wang, Linjie Li, Kevin Lin, et al.
Tan Wang, Linjie Li, Kevin Lin, Yuanhao Zhai, Chung-Ching Lin, Zhengyuan Yang, Hanwang Zhang, Zicheng Liu, Lijuan Wang -
DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion
(12 Apr 2023)
[ICCV 2023] Johanna Karras, Aleksander Holynski, Ting-Chun Wang, et al.
Johanna Karras, Aleksander Holynski, Ting-Chun Wang, Ira Kemelmacher-Shlizerman -
Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos (3 Apr 2023)
[AAAI 2024] Yue Ma, Yingqing He, et al.
Yue Ma, Yingqing He, Xiaodong Cun, Xintao Wang, Siran Chen, Ying Shan, Xiu Li, Qifeng Chen -
vid2vid: Video-to-Video Synthesis
(3 Dec 2018)
Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, et al.
Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro
-
Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models
(8 Jun 2025)
Sangwon Jang, Taekyung Ki, Jaehyeong Jo, et al.
Sangwon Jang, Taekyung Ki, Jaehyeong Jo, Jaehong Yoon, Soo Ye Kim, Zhe Lin, Sung Ju Hwang -
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding
(15 Apr 2025)
Dianbing Xi, Jiepeng Wang, Yuanzhi Liang, et al.
Xi Qiu, Yuchi Huo, Rui Wang, Chi Zhang, Xuelong Li -
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses
(30 Nov 2024)
Yatian Pang, Bin Zhu, Bin Lin, et al.
Mingzhe Zheng, Francis E. H. Tay, Ser-Nam Lim, Harry Yang, Li Yuan -
ControlNeXt: Powerful and Efficient Control for Image and Video Generation
(12 Aug 2024)
Bohao Peng, Jian Wang, Yuechen Zhang, et al.
Wenbo Li, Ming-Chang Yang, Jiaya Jia -
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
(21 Mar 2024)
[ECCV 2024] Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, et al.
Qingkun Su, Yinghui Xu, Xun Cao, Yao Yao, Hao Zhu, Siyu Zhu -
MoonShot: Towards Controllable Video Generation and Editing with Multimodal Conditions
(3 Jan 2024)
David Junhao Zhang, Dongxu Li, Hung Le, et al.
Mike Zheng Shou, Caiming Xiong, Doyen Sahoo -
SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models (28 Nov 2023)
[ECCV 2024] Yuwei Guo, Ceyuan Yang, Anyi Rao, et al.
Maneesh Agrawala, Dahua Lin, Bo Dai -
GD-VDM: Generated Depth for better Diffusion-based Video Generation (19 Jun 2023)
Ariel Lapid, Idan Achituve, Lior Bracha, et al.
Ethan Fetaya -
Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
(1 Jun 2023)
[IEEE TVCG 2024] Jinbo Xing, Menghan Xia, Yuxin Liu, et al.
Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong -
Control-A-Video: Controllable Text-to-Video Diffusion Models with Motion Prior and Reward Feedback Learning
(23 May 2023)
Weifeng Chen, Yatai Ji, Jie Wu, et al.
Hefeng Wu, Pan Xie, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin -
ControlVideo: Training-free Controllable Text-to-Video Generation (22 May 2023)
[ICLR 2024] Yabo Zhang, Yuxiang Wei, Dongsheng Jiang, et al.
Xiaopeng Zhang, Wangmeng Zuo, Qi Tian
-
HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation
(24 Mar 2025)
[CVPR 2025] Zunnan Xu, Zhentao Yu, et al.
Zunnan Xu, Zhentao Yu, Zixiang Zhou, Jun Zhou, Xiaoyu Jin, Fa-Ting Hong, Xiaozhong Ji, Junwei Zhu, Chengfei Cai, Shiyu Tang, Qin Lin, Xiu Li, Qinglin Lu -
TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation
(14 Mar 2025)
Hongxiang Zhao, Xingchen Liu, et al.
Hongxiang Zhao, Xingchen Liu, Mutian Xu, Yiming Hao, Weikai Chen, Xiaoguang Han -
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
(26 Nov 2024)
[CVPR 2025] Shenghai Yuan, Jinfa Huang, et al.
Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyuan Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan -
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
(15 Nov 2024)
[CVPR 2025] Rang Meng, Xingyu Zhang, Yuming Li, et al.
Chenguang Ma -
Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization
(18 Oct 2024)
Bin Lin, Yanzhen Yu, et al.
Bin Lin, Yanzhen Yu, Jianhao Ye, Ruitao Lv, Yuguang Yang, Ruoye Xie, Pan Yu, Hongbin Zhou -
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions
(12 Jul 2024)
[AAAI 2025] Zhiyuan Chen, Jiajiong Cao, Zhiquan Chen, et al.
Yuming Li, Chenguang Ma -
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
(3 Jul 2024)
Jianzhu Guo, Dingyun Zhang, et al.
Jianzhu Guo, Dingyun Zhang, Xiaoqiang Liu, Zhizhou Zhong, Yuan Zhang, Pengfei Wan, Di Zhang -
Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation
(4 Jun 2024)
[SIGGRAPH Asia 2024] Yue Ma, Hongyu Liu, et al.
Yue Ma, Hongyu Liu, Hongfa Wang, Heng Pan, Yingqing He, Junkun Yuan, Ailing Zeng, Chengfei Cai, Heung-Yeung Shum, Wei Liu, Qifeng Chen -
Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos (3 Apr 2023)
[AAAI 2024] Yue Ma, Yingqing He, et al.
Yue Ma, Yingqing He, Xiaodong Cun, Xintao Wang, Siran Chen, Ying Shan, Xiu Li, Qifeng Chen
-
Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models
(8 Jun 2025)
Sangwon Jang, Taekyung Ki, Jaehyeong Jo, et al.
Sangwon Jang, Taekyung Ki, Jaehyeong Jo, Jaehong Yoon, Soo Ye Kim, Zhe Lin, Sung Ju Hwang -
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding (15 Apr 2025)
Dianbing Xi, Jiepeng Wang, Yuanzhi Liang, et al.
Xi Qiu, Yuchi Huo, Rui Wang, Chi Zhang, Xuelong Li -
SketchVideo: Sketch-based Video Generation and Editing
(30 Mar 2025)
Feng-Lin Liu, Hongbo Fu, Xintao Wang, et al.
Feng-Lin Liu, Hongbo Fu, Xintao Wang, Weicai Ye, Pengfei Wan, Di Zhang, Lin Gao -
LayerAnimate: Layer-level Control for Animation
(22 Mar 2025)
Yuxue Yang, Lue Fan, Zuzeng Lin, et al.
Feng Wang, Zhaoxiang Zhang -
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
(13 Mar 2025)
Hao He, Ceyuan Yang, Shanchuan Lin, et al.
Yinghao Xu, Meng Wei, Liangke Gui, Qi Zhao, Gordon Wetzstein, Lu Jiang, Hongsheng Li -
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
(13 Mar 2025)
[ICLR 2025] Hao He, Yinghao Xu, Yuwei Guo, et al.
Gordon Wetzstein, Bo Dai, Hongsheng Li, Ceyuan Yang -
VidSketch: Hand-drawn Sketch-Driven Video Generation with Diffusion Control
(17 Feb 2025)
Lifan Jiang, Shuang Chen, Boxi Wu, et al.
Xiaotong Guan, Jiahui Zhang -
AniDoc: Animation Creation Made Easier
(30 Jan 2025)
[CVPR 2025] Yihao Meng, Hao Ouyang, Hanlin Wang, et al.
Qiuyu Wang, Wen Wang, Ka Leong Cheng, Zhiheng Liu, Yujun Shen, Huamin Qu -
Latent-Reframe: Enabling Camera Control for Video Diffusion Model without Training
(8 Dec 2024)
Zhenghong Zhou, Jie An, Jiebo Luo, et al.
-
CamI2V: Camera-Controlled Image-to-Video Diffusion Model
(4 Dec 2024)
Guangcong Zheng, Teng Li, Rui Jiang, et al.
Yehao Lu, Tao Wu, Xi Li -
Motion Prompting: Controlling Video Generation with Motion Trajectories (frame+track+text)
(3 Dec 2024)
Daniel Geng, Charles Herrmann, Junhwa Hur, et al.
Forrester Cole, Serena Zhang, Tobias Pfaff, Tatiana Lopez-Guevara, Carl Doersch, Yusuf Aytar, Michael Rubinstein, Chen Sun, Oliver Wang, Andrew Owens, Deqing Sun -
MOTIONFLOW: Learning Implicit Motion Flow for Complex Camera Trajectory Control in Video Generation
(Dec 2024)
Author list not fully provided
-
I2VControl: Disentangled and Unified Video Motion Synthesis Control
(30 Nov 2024)
Zhiyuan Zhang, Dongdong Chen, Jing Liao
-
Trajectory Attention: Enhancing Video Generation with Fine-Grained Motion Control
(28 Nov 2024)
Zeqi Xiao, Wenqi Ouyang, Yifan Zhou, et al.
Shuai Yang, Lei Yang, Jianlou Si, Xingang Pan -
Open-Sora Plan: Open-Source Large Video Generation Model
(28 Nov 2024)
Bin Lin, Yunyang Ge, Xinhua Cheng, et al.
Zongjian Li, Bin Zhu, Shaodong Wang, Xianyi He, Yang Ye, Shenghai Yuan, Liuhan Chen, Tanghui Jia, Junwu Zhang, Zhenyu Tang, Yatian Pang, Bin She, Cen Yan, Zhiheng Hu, Xiaoyi Dong, Lin Chen, Zhang Pan, Xing Zhou, Shaoling Dong, Yonghong Tian, Li Yuan -
ToonCrafter: Generative Cartoon Interpolation
(19 Nov 2024)
[ACM TOG] Jinbo Xing, Hanyuan Liu, Menghan Xia, et al.
Yong Zhang, Xintao Wang, Ying Shan, Tien-Tsin Wong -
MagicStick: Controllable Video Editing via Control Handle Transformations
(18 Nov 2024)
[WACV 2025] Yue Ma, Xiaodong Cun, Sen Liang, et al.
Jinbo Xing, Yingqing He, Chenyang Qi, Siran Chen, Qifeng Chen -
Video Diffusion Models are Training-free Motion Interpreter and Controller
(12 Nov 2024)
Zeqi Xiao, Yifan Zhou, Shuai Yang, Xingang Pan, et al.
-
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
(7 Nov 2024)
Wenqiang Sun, Shuo Chen, Fangfu Liu, et al.
Zilong Chen, Yueqi Duan, Jun Zhang, Yikai Wang -
MotionClone: Training-Free Motion Cloning for Controllable Video Generation
(22 Oct 2024)
Pengyang Ling, Jiazi Bu, Pan Zhang, et al.
Xiaoyi Dong, Yuhang Zang, Tong Wu, Huaian Chen, Jiaqi Wang, Yi Jin -
Boosting Camera Motion Control for Video Diffusion Transformers
(14 Oct 2024)
Soon Yau Cheong, Duygu Ceylan, Armin Mustafa, et al.
Andrew Gilbert, Chun-Hao Paul Huang -
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention
(14 Oct 2024)
Dejia Xu, Yifan Jiang, Chen Huang, et al.
Liangchen Song, Thorsten Gernoth, Liangliang Cao, Zhangyang Wang, Hao Tang -
LVCD: Reference-based Lineart Video Colorization with Diffusion Models
(19 Sep 2024)
Zhitong Huang, Mohan Zhang, Jing Liao
-
EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation
(16 Sep 2024)
Cong Wang, Jiaxi Gu, Panwen Hu, et al.
Haoyu Zhao, Yuanfan Guo, Jianhua Han, Hang Xu, Xiaodan Liang -
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
(3 Sep 2024)
Wangbo Yu, Jinbo Xing, Li Yuan, et al.
Wenbo Hu, Xiaoyu Li, Zhipeng Huang, Xiangjun Gao, Tien-Tsin Wong, Ying Shan, Yonghong Tian -
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
(17 Jul 2024)
Sherwin Bahmani, Ivan Skorokhodov, Aliaksandr Siarohin, et al.
Sherwin Bahmani, Ivan Skorokhodov, Aliaksandr Siarohin, Willi Menapace, Guocheng Qian, Michael Vasilkovsky, Hsin-Ying Lee, Chaoyang Wang, Jiaxu Zou, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov -
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
(16 Jul 2024)
Zhouxia Wang, Ziyang Yuan, Xintao Wang, et al.
Zhouxia Wang, Ziyang Yuan, Xintao Wang, Tianshui Chen, Menghan Xia, Ping Luo, Ying Shan -
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
(24 May 2024)
[ICLR 2025] Han Lin, Jaemin Cho, Abhay Zala, et al.
Mohit Bansal -
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos (25 Dec 2023)
[CVPR 2024] Xiang Wang, Shiwei Zhang, Hangjie Yuan, et al.
Zhiwu Qing, Biao Gong, Yingya Zhang, Yujun Shen, Changxin Gao, Nong Sang -
VideoLCM: Video Latent Consistency Model
(14 Dec 2023)
Xiang Wang, Shiwei Zhang, Han Zhang, et al.
Yu Liu, Yingya Zhang, Changxin Gao, Nong Sang -
SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models
(28 Nov 2023)
Yuwei Guo, Ceyuan Yang, Anyi Rao, et al.
Maneesh Agrawala, Dahua Lin, Bo Dai -
Make Pixels Dance: High-Dynamic Video Generation (18 Nov 2023)
[CVPR 2024] Yan Zeng, Guoqiang Wei, Jiani Zheng, et al.
Jiaxin Zou, Yang Wei, Yuchen Zhang, Hang Li -
TaleCrafter: Interactive Story Visualization with Multiple Characters
(30 May 2023)
[SIGGRAPH Asia 2023] Yuan Gong, Youxin Pang, Xiaodong Cun, et al.
Menghan Xia, Yingqing He, Haoxin Chen, Longyue Wang, Yong Zhang, Xintao Wang, Ying Shan, Yujiu Yang -
Sketching the Future (STF): Applying Conditional Control Techniques to Text-to-Video Models
(10 May 2023)
Rohan Dhesikan, Vignesh Rajmohan
-
SketchBetween: Video-to-Video Synthesis for Sprite Animation via Sketches (1 Sep 2022)
Dagmar Lukka Loftsdóttir, Matthew Guzdial
-
Sketch Me A Video (10 Oct 2021)
Haichao Zhang, Gang Yu, Tao Chen, et al.
Haichao Zhang, Gang Yu, Tao Chen, Guozhong Luo -
vid2vid: Video-to-Video Synthesis
(3 Dec 2018)
Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, et al.
Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro
-
HoloDrive: Holistic 2D-3D Multi-Modal Street Scene Generation for Autonomous Driving
(02 Dec 2024)
Zehuan Wu, Jingcheng Ni, Xiaodong Wang, et al.
Zehuan Wu, Jingcheng Ni, Xiaodong Wang, Yuxin Guo, Rui Chen, Lewei Lu, Jifeng Dai, Yuwen Xiong
-
Higher fidelity autonomous vehicle video generation with bounding-box controlled object motion
(8 Dec 2024)
Ge Ya Luo, Zhi Hao Luo, et al.
Ge Ya Luo, Zhi Hao Luo, Anthony Gosselin, Alexia Jolicoeur-Martineau, Christopher Pal -
MagicDrive-V2: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control
(21 Nov 2024)
Ruiyuan Gao, Kai Chen, Bo Xiao, et al.
Ruiyuan Gao, Kai Chen, Bo Xiao, Lanqing Hong, Zhenguo Li, Qiang Xu -
DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes
(06 Sep 2024)
Jianbiao Mei, Tao Hu, Xuemeng Yang, et al.
Jianbiao Mei, Tao Hu, Xuemeng Yang, Licheng Wen, Yu Yang, Tiantian Wei, Yukai Ma, Min Dou, Botian Shi, Yong Liu -
DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation
(09 Sep 2024)
Wei Wu, Xi Guo, Weixuan Tang, et al.
Wei Wu, Xi Guo, Weixuan Tang, Tingxuan Huang, Chiyu Wang, Dongyue Chen, Chenjing Ding -
DiVE: DiT-based Video Generation with Enhanced Control
(03 Sep 2024)
Junpeng Jiang, Gangyi Hong, Lijun Zhou, et al.
Junpeng Jiang, Gangyi Hong, Lijun Zhou, Enhui Ma, Hengtong Hu, Xia Zhou, Jie Xiang, Fan Liu, Kaicheng Yu, Haiyang Sun, Kun Zhan, Peng Jia, Miao Zhang -
Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation
(3 Jun 2024)
Enhui Ma, Lijun Zhou, Tao Tang, et al.
Enhui Ma, Lijun Zhou, Tao Tang, Zhan Zhang, Dong Han, Junpeng Jiang, Kun Zhan, Peng Jia, Xianpeng Lang, Haiyang Sun, Di Lin, Kaicheng Yu
-
Boximator: Generating Rich and Controllable Motions for Video Synthesis
(2 Feb 2024)
Jiawei Wang, Yuchen Zhang, Jiaxin Zou, et al.
Yan Zeng, Guoqiang Wei, Liping Yuan, Hang Li
-
Panacea: Panoramic and Controllable Video Generation for Autonomous Driving
(28 Nov 2023)
Yuqing Wen, Yucheng Zhao, Yingfei Liu, et al.
Yuqing Wen, Yucheng Zhao, Yingfei Liu, Fan Jia, Yanhui Wang, Chong Luo, Chi Zhang, Tiancai Wang, Xiaoyan Sun, Xiangyu Zhang -
MagicDrive: Street View Generation with Diverse 3D Geometry Control
(04 Oct 2023)
Ruiyuan Gao, Kai Chen, Enze Xie, et al.
Ruiyuan Gao, Kai Chen, Enze Xie, Lanqing Hong, Zhenguo Li, Dit-Yan Yeung, Qiang Xu -
DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving
(18 Sep 2023)
Xiaofeng Wang, Zheng Zhu, Guan Huang, et al.
Xiaofeng Wang, Zheng Zhu, Guan Huang, Xinze Chen, Jiagang Zhu, Jiwen Lu -
LLM-grounded Video Diffusion Models (29 Sep 2023)
Long Lian, Baifeng Shi, Adam Yala, et al.
Long Lian, Baifeng Shi, Adam Yala, Trevor Darrell, Boyi Li -
Multi-object Video Generation from Single Frame Layouts
(06 May 2023)
Yang Wu, Zhibin Liu, Hefeng Wu, et al.
Yang Wu, Zhibin Liu, Hefeng Wu, Liang Lin
-
DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation
(23 May 2025)
Junhao Chen, Mingjin Chen, Jianjin Xu, et al.
Junhao Chen, Mingjin Chen, Jianjin Xu, Xiang Li, Junting Dong, Mingze Sun, Puhua Jiang, Hongxiang Li, Yuhang Yang, Hao Zhao, Xiaoxiao Long, Ruqi Huang
-
MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation
(30 May 2025)
Yanbo Ding, Xirui Hu, Zhizhi Guo, et al.
Yanbo Ding, Xirui Hu, Zhizhi Guo, Chi Zhang, Yali Wang
-
SkyReels-A2: Compose Anything in Video Diffusion Transformers
(3 Apr 2025)
Zhengcong Fei, Debang Li, Di Qiu, et al.
Zhengcong Fei, Debang Li, Di Qiu, Jiahua Wang, Yikun Dou, Rui Wang, Jingtao Xu, Mingyuan Fan, Guibin Chen, Yang Li, Yahui Zhou
-
AnimateAnywhere: Rouse the Background in Human Image Animation
(28 Apr 2025)
Xiaoyu Liu, Mingshuai Yao, Yabo Zhang, et al.
Xiaoyu Liu, Mingshuai Yao, Yabo Zhang, Xianhui Lin, Peiran Ren, Xiaoming Li, Ming Liu, Wangmeng Zuo -
Concat-ID: Towards Universal Identity-Preserving Video Synthesis
(19 Apr 2025)
Yong Zhong, Zhuoyi Yang, Jiayan Teng, et al.
Yong Zhong, Zhuoyi Yang, Jiayan Teng, Xiaotao Gu, Chongxuan Li
-
PERSONALVIDEO: High ID-Fidelity Video Customization with Static Images (16 Mar 2025)
Hengjia Li, Haonan Qiu, Shiwei Zhang, Xiang Wang, et al.
Hengjia Li, Haonan Qiu, Shiwei Zhang, Xiang Wang, Yujie Wei, Zekun Li, Yingya Zhang, Boxi Wu, Deng Cai -
AnyCharV: Bootstrap Controllable Character Video Generation with Fine-to-Coarse Guidance
(12 Feb 2025)
Zhao Wang, Hao Wen, Lingting Zhu, et al.
Zhao Wang, Hao Wen, Lingting Zhu, Chenming Shang, Yujiu Yang, Qi Dou -
Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts (4 Feb 2025)
Feng Liang, Haoyu Ma, Zecheng He, et al.
Feng Liang, Haoyu Ma, Zecheng He, Tingbo Hou, Ji Hou, Kunpeng Li, Xiaoliang Dai, Felix Juefei-Xu, Samaneh Azadi, Animesh Sinha, Peizhao Zhang, Peter Vajda, Diana Marculescu -
EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion (23 Jan 2025)
Jiangchuan Wei, Shiyue Yan, Wenfeng Lin, et al.
Jiangchuan Wei, Shiyue Yan, Wenfeng Lin, Boyuan Liu, Renjie Chen, Mingyu Guo
-
VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention (03 Dec 2024)
Mingzhe Zheng, Yongqi Xu, Haojian Huang, et al.
Mingzhe Zheng, Yongqi Xu, Haojian Huang, Xuran Ma, Yexin Liu, Wenjie Shu, Yatian Pang, Feilong Tang, Qifeng Chen, Harry Yang, Ser-Nam Lim -
Identity-Preserving Text-to-Video Generation by Frequency Decomposition (25 Nov 2024)
Shenghai Yuan, Jinfa Huang, et al.
Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyuan Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan -
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling
(24 Sep 2024)
Yifang Men, Yuan Yao, Miaomiao Cui, et al.
Yifang Men, Yuan Yao, Miaomiao Cui, Liefeng Bo
-
ID-Animator: Zero-Shot Identity-Preserving Human Video Generation (25 Jun 2024)
Xuanhua He, Quande Liu, et al.
Xuanhua He, Quande Liu, Shengju Qian, Xin Wang, Tao Hu, Ke Cao, Keyu Yan, Jie Zhang -
Magic-Me: Identity-Specific Video Customized Diffusion (14 Feb 2024)
Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, et al.
Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, Xue-She Wang, Xiuyu Li, Huanrui Yang, Zhen Dong, Kurt Keutzer, Jiashi Feng -
Vlogger: Make Your Dream A Vlog (17 Jan 2024)
Shaobin Zhuang, Kunchang Li, Xinyuan Chen, et al.
Shaobin Zhuang, Kunchang Li, Xinyuan Chen, Yaohui Wang, Ziwei Liu, Yu Qiao, Yali Wang
-
DualReal: Joint Training for Lossless Identity-Motion Fusion in Video Customization
(4 May 2025)
Wenchuan Wang, Mengqi Huang, Yijing Tu, et al.
Wenchuan Wang, Mengqi Huang, Yijing Tu, Zhendong Mao -
Concat-ID: Towards Universal Identity-Preserving Video Synthesis
(19 Apr 2025)
Yong Zhong, Zhuoyi Yang, Jiayan Teng, et al.
Yong Zhong, Zhuoyi Yang, Jiayan Teng, Xiaotao Gu, Chongxuan Li -
VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models
(27 Mar 2025)
Chi-Pin Huang, Yen-Siang Wu, Hung-Kai Chung, et al.
Chi-Pin Huang, Yen-Siang Wu, Hung-Kai Chung, Kai-Po Chang, Fu-En Yang, Yu-Chiang Frank Wang
-
DreamRelation: Relation-Centric Video Customization (10 Mar 2025)
Yujie Wei, Shiwei Zhang, Hangjie Yuan, et al.
Yujie Wei, Shiwei Zhang, Hangjie Yuan, Biao Gong, Longxiang Tang, Xiang Wang, Haonan Qiu, Hengjia Li, Shuai Tan, Yingya Zhang, Hongming Shan -
Get In Video: Add Anything You Want to the Video (8 Mar 2025)
Shaobin Zhuang, Zhipeng Huang, Binxin Yang, et al.
Shaobin Zhuang, Zhipeng Huang, Binxin Yang, Ying Zhang, Fangyikang Wang, Canmiao Fu, Chong Sun, Zheng-Jun Zha, Chen Li, Yali Wang -
Phantom: Subject-consistent Video Generation via Cross-modal Alignment (16 Feb 2025)
Lijie Liu, Tianxiang Ma, Bingchuan Li, et al.
Lijie Liu, Tianxiang Ma, Bingchuan Li, Zhuowei Chen, Jiawei Liu, Qian He, Xinglong Wu -
Multi-subject Open-set Personalization in Video Generation (10 Jan 2025)
Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, et al.
Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Yuwei Fang, Kwot Sin Lee, Ivan Skorokhodov, Kfir Aberman, Jun-Yan Zhu, Ming-Hsuan Yang, Sergey Tulyakov -
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning (8 Jan 2025)
Yuzhou Huang, Ziyang Yuan, Quande Liu, et al.
Yuzhou Huang, Ziyang Yuan, Quande Liu, Qiulin Wang, Xintao Wang, Ruimao Zhang, Pengfei Wan, Di Zhang, Kun Gai -
VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models (27 Dec 2024)
Tao Wu, Yong Zhang, Xiaodong Cun, et al.
Tao Wu, Yong Zhang, Xiaodong Cun, Zhongang Qi, Junfu Pu, Huanzhang Dou, Guangcong Zheng, Ying Shan, Xi Li -
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities (27 Dec 2024)
Tao Wu, Yong Zhang, Xintao Wang, et al.
Tao Wu, Yong Zhang, Xintao Wang, Xianpan Zhou, Guangcong Zheng, Zhongang Qi, Ying Shan, Xi Li -
CustomTTT: Motion and Appearance Customized Video Generation via Test-Time Training (20 Dec 2024)
Xiuli Bi, Jian Lu, Bo Liu, et al.
Xiuli Bi, Jian Lu, Bo Liu, Xiaodong Cun, Yong Zhang, Weisheng Li, Bin Xiao -
SUGAR: Subject-Driven Video Customization in a Zero-Shot Manner (13 Dec 2024)
Yufan Zhou, Ruiyi Zhang, Jiuxiang Gu, et al.
Yufan Zhou, Ruiyi Zhang, Jiuxiang Gu, Nanxuan Zhao, Jing Shi, Tong Sun -
DreamRunner: Fine-Grained Compositional Story-to-Video Generation with Retrieval-Augmented Motion Adaptation (25 Nov 2024)
Zun Wang, Jialu Li, Han Lin, et al.
Zun Wang, Jialu Li, Han Lin, Jaehong Yoon, Mohit Bansal -
VideoAlchemy: Open-set Personalization in Video Generation (15 Nov 2024)
Tsai-Shien Chen, Aliaksandr Siarohin, et al.
Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Yuwei Fang, Ivan Skorokhodov, Jun-Yan Zhu, Kfir Aberman, Ming-Hsuan Yang, Sergey Tulyakov -
StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration (7 Nov 2024)
Panwen Hu, Jin Jiang, Jianqi Chen, et al.
Panwen Hu, Jin Jiang, Jianqi Chen, Mingfei Han, Shengcai Liao, Xiaojun Chang, Xiaodan Liang -
MotionBooth: Motion-Aware Customized Text-to-Video Generation (29 Oct 2024)
Jianzong Wu, Xiangtai Li, Yanhong Zeng, et al.
Jianzong Wu, Xiangtai Li, Yanhong Zeng, Jiangning Zhang, Qianyu Zhou, Yining Li, Yunhai Tong, Kai Chen -
Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation (19 Aug 2024)
Yunxin Li, Haoyuan Shi, Baotian Hu, et al.
Yunxin Li, Haoyuan Shi, Baotian Hu, Longyue Wang, Jiashun Zhu, Jinyi Xu, Zhen Zhao, Min Zhang -
Still-Moving: Customized Video Generation Without Customized Video Data (11 Jul 2024)
Hila Chefer, Shiran Zada, Roni Paiss, Ariel Ephrat, et al.
Hila Chefer, Shiran Zada, Roni Paiss, Ariel Ephrat, Omer Tov, Michael Rubinstein, Lior Wolf, Tali Dekel, Tomer Michaeli, Inbar Mosseri -
VIMI: Grounding Video Generation through Multi-modal Instruction (8 Jul 2024)
Yuwei Fang, Willi Menapace, Aliaksandr Siarohin, et al.
Yuwei Fang, Willi Menapace, Aliaksandr Siarohin, Tsai-Shien Chen, Kuan-Chien Wang, Ivan Skorokhodov, Graham Neubig, Sergey Tulyakov -
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects (22 May 2024)
Zhao Wang, Aoxue Li, et al.
Zhao Wang, Aoxue Li, Lingting Zhu, Yong Guo, Qi Dou, Zhenguo Li -
DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control (21 May 2024)
Hong Chen, Xin Wang, Yipeng Zhang, et al.
Hong Chen, Xin Wang, Yipeng Zhang, Yuwei Zhou, Zeyang Zhang, Siao Tang, Wenwu Zhu -
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers (9 May 2024)
Peng Gao, Le Zhuo, Dongyang Liu, et al.
Peng Gao, Le Zhuo, Dongyang Liu, Ruoyi Du, Xu Luo, Longtian Qiu, Yuhang Zhang, Chen Lin, Rongjie Huang, Shijie Geng, Renrui Zhang, Junlin Xi, Wenqi Shao, Zhengkai Jiang, Tianshuo Yang, Weicai Ye, He Tong, Jingwen He, Yu Qiao, Hongsheng Li -
SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control (28 Mar 2024)
Zheng Chen, Di Qiu, Rui Wang, et al.
Zheng Chen, Di Qiu, Rui Wang, Binyuan Huang, Yuqing Wen, Yucheng Zhao, Yaosi Hu, Yingfei Liu, Fan Jia, Weixin Mao, Tiancai Wang, Chi Zhang, Chang Wen Chen, Zhenzhong Chen, Xiangyu Zhang -
AesopAgent: Agent-driven Evolutionary System on Story-to-Video Production (12 Mar 2024)
Jiuniu Wang, Zehua Du, Yuyuan Zhao, et al.
Jiuniu Wang, Zehua Du, Yuyuan Zhao, Bo Yuan, Kexiang Wang, Jian Liang, Yaxi Zhao, Yihen Lu, Gengliang Li, Junlong Gao, Xin Tu, Zhenyu Guo -
VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM (2 Jan 2024)
Fuchen Long, Zhaofan Qiu, Ting Yao, et al.
Fuchen Long, Zhaofan Qiu, Ting Yao, Tao Mei -
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion (7 Dec 2023)
Yujie Wei, Shiwei Zhang, Zhiwu Qing, et al.
Yujie Wei, Shiwei Zhang, Zhiwu Qing, Hangjie Yuan, Zhiheng Liu, Yu Liu, Yingya Zhang, Jingren Zhou, Hongming Shan -
VideoBooth: Diffusion-based Video Generation with Image Prompts (1 Dec 2023)
Yuming Jiang, Tianxing Wu, Shuai Yang, et al.
Yuming Jiang, Tianxing Wu, Shuai Yang, Chenyang Si, Dahua Lin, Yu Qiao, Chen Change Loy, Ziwei Liu -
VideoDreamer: Customized Multi-Subject Text-to-Video Generation with Disen-Mix Finetuning (2 Nov 2023)
Hong Chen, Xin Wang, Guanning Zeng, et al.
Hong Chen, Xin Wang, Guanning Zeng, Yipeng Zhang, Yuwei Zhou, Feilin Han, Wenwu Zhu -
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation (13 Jul 2023)
Yingqing He, Menghan Xia, Haoxin Chen, et al.
Yingqing He, Menghan Xia, Haoxin Chen, Xiaodong Cun, Yuan Gong, Jinbo Xing, Yong Zhang, Xintao Wang, Chao Weng, Ying Shan, Qifeng Chen -
TaleCrafter: Interactive Story Visualization with Multiple Characters (30 May 2023)
Yuan Gong, Youxin Pang, et al.
Yuan Gong, Youxin Pang, Xiaodong Cun, Menghan Xia, Yingqing He, Haoxin Chen, Longyue Wang, Yong Zhang, Xintao Wang, Ying Shan, Yujiu Yang -
Dreamix: Video Diffusion Models are General Video Editors (2 Feb 2023)
Eyal Molad, Eliahu Horwitz, et al.
Eyal Molad, Eliahu Horwitz, Dani Valevski, Alex Rav Acha, Yossi Matias, Yael Pritch, Yaniv Leviathan, Yedid Hoshen
-
Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models
(8 Jun 2025)
Sangwon Jang, Taekyung Ki, Jaehyeong Jo, et al.
Sangwon Jang, Taekyung Ki, Jaehyeong Jo, Jaehong Yoon, Soo Ye Kim, Zhe Lin, Sung Ju Hwang
-
DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation
(23 May 2025)
Junhao Chen, Mingjin Chen, Jianjin Xu, et al.
Junhao Chen, Mingjin Chen, Jianjin Xu, Xiang Li, Junting Dong, Mingze Sun, Puhua Jiang, Hongxiang Li, Yuhang Yang, Hao Zhao, Xiaoxiao Long, Ruqi Huang
-
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding (15 Apr 2025)
Dianbing Xi, Jiepeng Wang, Yuanzhi Liang, et al.
Xi Qiu, Yuchi Huo, Rui Wang, Chi Zhang, Xuelong Li -
HunyuanVideo: A Systematic Framework For Large Video Generative Models
(11 Mar 2025)
Weijie Kong, Qi Tian, Zijian Zhang, et al.
Rox Min, Zuozhuo Dai, Jin Zhou, Jiangfeng Xiong, Xin Li, Bo Wu, Jianwei Zhang, Kathrina Wu, Qin Lin, Junkun Yuan, Yanxin Long, Aladdin Wang, Andong Wang, Changlin Li, Duojun Huang, Fang Yang, Hao Tan, Hongmei Wang, Jacob Song, Jiawang Bai, Jianbing Wu, Jinbao Xue, Joey Wang, Kai Wang, Mengyang Liu, Pengyu Li, Shuai Li, Weiyan Wang, Wenqing Yu, Xinchi Deng, Yang Li, Yi Chen, Yutao Cui, Yuanbo Peng, Zhentao Yu, Zhiyu He, Zhiyong Xu, Zixiang Zhou, Zunnan Xu, Yangyu Tao, Qinglin Lu, Songtao Liu, Dax Zhou, Hongfa Wang, Yong Yang, Di Wang, Yuhong Liu, Jie Jiang, Caesar Zhong -
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think (2 Mar 2025)
[CVPR 2025] Jie Tian, Xiaoye Qu, Zhenyi Lu, et al.
Wei Wei, Sichen Liu, Yu Cheng -
Adapting Image-to-Video Diffusion Models for Large-Motion Frame Interpolation (17 Feb 2025)
Luoxu Jin, Hiroshi Watanabe
-
Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation (12 Feb 2025)
[ICLR 2025] Xiaojuan Wang, Boyang Zhou, Brian Curless, et al.
Ira Kemelmacher-Shlizerman, Aleksander Holynski, Steven M. Seitz -
Autoregressive Video Generation without Vector Quantization (9 Jan 2025)
[ICLR 2025] Haoge Deng, Ting Pan, Haiwen Diao, et al.
Zhengxiong Luo, Yufeng Cui, Huchuan Lu, Shiguang Shan, Yonggang Qi, Xinlong Wang -
Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation (6 Jan 2025)
[CVPR 2025] Guy Yariv, Yuval Kirstain, Amit Zohar, et al.
Shelly Sheynin, Yaniv Taigman, Yossi Adi, Sagie Benaim, Adam Polyak -
STIV: Scalable Text and Image Conditioned Video Generation (10 Dec 2024)
Zongyu Lin, Wei Liu, Chen Chen, et al.
Jiasen Lu, Wenze Hu, Tsu-Jui Fu, Jesse Allardice, Zhengfeng Lai, Liangchen Song, Bowen Zhang, Cha Chen, Yiran Fei, Yifan Jiang, Lezhi Li, Yizhou Sun, Kai-Wei Chang, Yinfei Yang -
MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation (8 Dec 2024)
Shuwei Shi, Biao Gong, Xi Chen, et al.
Dandan Zheng, Shuai Tan, Zizheng Yang, Yuyuan Li, Jingwen He, Kecheng Zheng, Jingdong Chen, Ming Yang, Yinqiang Zheng -
Lumiere: A Space-Time Diffusion Model for Video Generation
(3 Dec 2024)
[SIGGRAPH Asia 2024] Omer Bar-Tal, Hila Chefer, Omer Tov, et al.
Charles Herrmann, Roni Paiss, Shiran Zada, Ariel Ephrat, Junhwa Hur, Guanghui Liu, Amit Raj, Yuanzhen Li, Michael Rubinstein, Tomer Michaeli, Oliver Wang, Deqing Sun, Tali Dekel, Inbar Mosseri -
Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model (6 Nov 2024)
[NeurIPS 2024] Min Zhao, Hongzhou Zhu, Chendong Xiang, et al.
Kaiwen Zheng, Chongxuan Li, Jun Zhu -
FrameBridge: Improving Image-to-Video Generation with Bridge Models (20 Oct 2024)
Yuji Wang, Zehua Chen, Xiaoyu Chen, et al.
Jun Zhu, Jianfei Chen -
I4VGen: Image as Free Stepping Stone for Text-to-Video Generation
(3 Oct 2024)
Xiefan Guo, Jinlin Liu, Miaomiao Cui, et al.
Liefeng Bo, Di Huang -
DynamiCrafter: Animating Open-Domain Images with Video Diffusion Priors (1 Oct 2024)
[ECCV 2024] Jinbo Xing, Menghan Xia, Yong Zhang, et al.
Haoxin Chen, Wangbo Yu, Hanyuan Liu, Gongye Liu, Xintao Wang, Ying Shan, Tien-Tsin Wong -
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation (27 Sep 2024)
[ECCV 2024] Shaowei Liu, Zhongzheng Ren, Saurabh Gupta, et al.
Shenlong Wang -
Structure and Content-Guided Video Synthesis with Diffusion Models (27 Sep 2024)
[ICCV 2023] Patrick Esser, Johnathan Chiu, Parmida Atighehchian, et al.
Jonathan Granskog, Anastasis Germanidis -
DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance (16 Sep 2024)
[ICASSP 2025] Cong Wang, Jiaxi Gu, Panwen Hu, et al.
Songcen Xu, Hang Xu, Xiaodan Liang -
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning (2 Aug 2024)
[ECCV 2024] Rohit Girdhar, Mannat Singh, Andrew Brown, et al.
Quentin Duval, Samaneh Azadi, Sai Saketh Rambhatla, Akbar Shah, Xi Yin, Devi Parikh, Ishan Misra -
MoVideo: Motion-Aware Video Generation with Diffusion Model (29 Jul 2024)
[ECCV 2024] Jingyun Liang, Yuchen Fan, Kai Zhang, et al.
Radu Timofte, Luc Van Gool, Rakesh Ranjan -
I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models (13 Jul 2024)
[ACM SIGGRAPH 2024] Xun Guo, Mingwu Zheng, Liang Hou, et al.
Yuan Gao, Yufan Deng, Pengfei Wan, Di Zhang, Yufan Liu, Weiming Hu, Zhengjun Zha, Haibin Huang, Chongyang Ma -
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture (5 Jul 2024)
Jiaqi Xu, Xinyi Zou, Kunzhe Huang, et al.
Yunkuo Chen, Bo Liu, MengLi Cheng, Xing Shi, Jun Huang -
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation (1 Jul 2024)
[TMLR 2024] Weiming Ren, Huan Yang, Ge Zhang, et al.
Cong Wei, Xinrun Du, Wenhao Huang, Wenhu Chen -
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation (13 Jun 2024)
[NeurIPS 2024] Junke Wang, Yi Jiang, Zehuan Yuan, et al.
Binyue Peng, Zuxuan Wu, Yu-Gang Jiang -
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction (10 Jun 2024)
Zhen Xing, Qi Dai, Zejia Weng, et al.
Zuxuan Wu, Yu-Gang Jiang -
TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models (25 Apr 2024)
[CVPR 2024] Haomiao Ni, Bernhard Egger, Suhas Lohit, et al.
Anoop Cherian, Ye Wang, Toshiaki Koike-Akino, Sharon X. Huang, Tim K. Marks -
TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models (25 Mar 2024)
[CVPR 2024] Zhongwei Zhang, Fuchen Long, Yingwei Pan, et al.
Zhaofan Qiu, Ting Yao, Yang Cao, Tao Mei -
Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts (13 Mar 2024)
Yue Ma, Yingqing He, Hongfa Wang, et al.
Andong Wang, Chenyang Qi, Chengfei Cai, Xiu Li, Zhifeng Li, Heung-Yeung Shum, Wei Liu, Qifeng Chen -
AtomoVideo: High Fidelity Image-to-Video Generation (5 Mar 2024)
Litong Gong, Yiran Zhu, Weijie Li, et al.
Xiaoyang Kang, Biao Wang, Tiezheng Ge, Bo Zheng -
Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation (5 Mar 2024)
Weijie Li, Litong Gong, Yiran Zhu, et al.
Fanda Fan, Biao Wang, Tiezheng Ge, Bo Zheng -
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling (31 Jan 2024)
[ACM SIGGRAPH 2024] Xiaoyu Shi, Zhaoyang Huang, Fu-Yun Wang, et al.
Weikang Bian, Dasong Li, Yi Zhang, Manyuan Zhang, Ka Chun Cheung, Simon See, Hongwei Qin, Jifeng Dai, Hongsheng Li -
UniVG: Towards UNIfied-modal Video Generation (17 Jan 2024)
Ludan Ruan, Lei Tian, Chuanwei Huang, et al.
Xu Zhang, Xinyan Xiao -
Decouple Content and Motion for Conditional Image-to-Video Generation (14 Dec 2023)
[AAAI 2024] Cuifeng Shen, Yulu Gan, Chen Chen, et al.
Xiongwei Zhu, Lele Cheng, Tingting Gao, Jinzhi Wang -
AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance
(4 Dec 2023)
[TMLR 2024] Zuozhuo Dai, Zhenghao Zhang, Yao Yao, et al.
Bingxue Qiu, Siyu Zhu, Long Qin, Weizhi Wang -
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets (25 Nov 2023)
Andreas Blattmann, Tim Dockhorn, Sumith Kulal, et al.
Daniel Mendelevitch, Maciej Kilian, Dominik Lorenz, Yam Levi, Zion English, Vikram Voleti, Adam Letts, Varun Jampani, Robin Rombach -
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
(7 Nov 2023)
Shiwei Zhang, Jiayu Wang, Yingya Zhang, et al.
Kang Zhao, Hangjie Yuan, Zhiwu Qin, Xiang Wang, Deli Zhao, Jingren Zhou -
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
(30 Oct 2023)
Haoxin Chen, Menghan Xia, Yingqing He, et al.
Yong Zhang, Xiaodong Cun, Shaoshu Yang, Jinbo Xing, Yaofang Liu, Qifeng Chen, Xintao Wang, Chao Weng, Ying Shan -
Synthesizing Videos from Images for Image-to-Video Adaptation (27 Oct 2023)
[ACM MM 2023] Junbao Zhuo, Xingyu Zhao, Shuhui Wang, et al.
Huimin Ma, Qingming Huang -
VideoDoodles: Hand-Drawn Animations on Videos with Scene-Aware Canvases
(26 Jul 2023)
[ACM Transactions on Graphics] Emilie Yu, Kevin Blackburn-Matzen, Cuong Nguyen, et al.
Oliver Wang, Rubaiat Habib Kazi, Adrien Bousseau -
LaMD: Latent Motion Diffusion for Video Generation (23 Apr 2023)
Yaosi Hu, Zhenzhong Chen, Chong Luo
-
Prompt Image to Life: Training-Free Text-Driven Image-to-Video Generation (2023)
Jinxiu Liu, Yuan Yao, Bingwen Zhu, et al.
Fanyi Wang, Weijian Luo, Jingwen Su, Yanhao Zhang, Yuxiao Wang, Liyuan Ma, Qi Liu, Jiebo Luo, Guo-Jun Qi -
Make It Move: Controllable Image-to-Video Generation With Text Descriptions (31 Mar 2022)
[CVPR 2022] Yaosi Hu, Chong Luo, Zhenzhong Chen
-
VideoGPT: Video Generation using VQ-VAE and Transformers (14 Sep 2021)
Wilson Yan, Yunzhi Zhang, Pieter Abbeel, et al.
Wilson Yan, Yunzhi Zhang, Pieter Abbeel, Aravind Srinivas -
ImaGINator: Conditional Spatio-Temporal GAN for Video Generation (2020)
[WACV 2020] Yaohui Wang, Piotr Bilinski, Francois Bremond, et al.
Yaohui Wang, Piotr Bilinski, Francois Bremond, Antitza Dantcheva
-
MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model
(30 May 2024)
[ECCV 2024] Muyao Niu, Xiaodong Cun, et al.
Muyao Niu, Xiaodong Cun, Xintao Wang, Yong Zhang, Ying Shan, Yinqiang Zheng -
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling (29 Jan 2024)
[SIGGRAPH 2024] Xiaoyu Shi, Zhaoyang Huang, et al.
Xiaoyu Shi, Zhaoyang Huang, Fu-Yun Wang, Weikang Bian, Dasong Li, Yi Zhang, Manyuan Zhang, Ka Chun Cheung, Simon See, Hongwei Qin, Jifeng Dai, Hongsheng Li -
Motion-Conditioned Diffusion Model for Controllable Video Synthesis (27 Apr 2023)
Tsai-Shien Chen, Chieh Hubert Lin, et al.
Tsai-Shien Chen, Chieh Hubert Lin, Hung-Yu Tseng, Tsung-Yi Lin, Ming-Hsuan Yang -
I2VControl: Disentangled and Unified Video Motion Synthesis Control
(26 Nov 2024)
Wanquan Feng, Tianhao Qi, et al.
Wanquan Feng, Tianhao Qi, Jiawei Liu, Mingzhen Sun, Pengqi Tu, Tianxiang Ma, Fei Dai, Songtao Zhao, Siyu Zhou, Qian He
-
ATI: Any Trajectory Instruction for Controllable Video Generation (10 Jun 2025)
Angtian Wang, Haibin Huang, Jacob Zhiyuan Fang, et al.
Angtian Wang, Haibin Huang, Jacob Zhiyuan Fang, Yiding Yang, Chongyang Ma -
MOVi: Training-free Text-conditioned Multi-Object Video Generation (29 May 2025)
Aimon Rahman, Jiang Liu, Ze Wang, et al.
Aimon Rahman, Jiang Liu, Ze Wang, Ximeng Sun, Jialian Wu, Xiaodong Yu, Yusheng Su, Vishal M. Patel, Zicheng Liu, Emad Barsoum -
AnimateAnywhere: Rouse the Background in Human Image Animation
(28 Apr 2025)
Xiaoyu Liu, Mingshuai Yao, Yabo Zhang, et al.
Xiaoyu Liu, Mingshuai Yao, Yabo Zhang, Xianhui Lin, Peiran Ren, Xiaoming Li, Ming Liu, Wang -
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation
(21 Apr 2025)
Chenjie Cao, Jingkai Zhou, Shikai Li, et al.
Chenjie Cao, Jingkai Zhou, Shikai Li, Jingyun Liang, Chaohui Yu, Fan Wang, Xiangyang Xue, Yanwei Fu -
VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation (2 Apr 2025)
Sixiao Zheng, Zimian Peng, Yanpeng Zhou, et al.
Sixiao Zheng, Zimian Peng, Yanpeng Zhou, Yi Zhu, Hang Xu, Xiangru Huang, Yanwei Fu -
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis (28 Mar 2025)
Hanlin Wang, Hao Ouyang, Qiuyu Wang, et al.
Wen Wang, Ka Leong Cheng, Qifeng Chen, Yujun Shen, Limin Wang -
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance (20 Mar 2025)
Quanhao Li, Zhen Xing, Rui Wang, et al.
Quanhao Li, Zhen Xing, Rui Wang, Hui Zhang, Qi Dai, Zuxuan Wu -
Tora: Trajectory-oriented Diffusion Transformer for Video Generation (14 Mar 2025)
Zhenghao Zhang, Junchao Liao, Menghao Li, et al.
Zhenghao Zhang, Junchao Liao, Menghao Li, ZuoZhuo Dai, Bingxue Qiu, Siyu Zhu, Long Qin, Weizhi Wang -
LayerAnimate: Layer-level Control for Animation (22 Mar 2025)
Yuxue Yang, Lue Fan, Zuzeng Lin, et al.
Feng Wang, Zhaoxiang Zhang -
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control (22 Mar 2025)
Sherwin Bahmani, Ivan Skorokhodov, Aliaksandr Siarohin, et al.
Sherwin Bahmani, Ivan Skorokhodov, Aliaksandr Siarohin, Willi Menapace, Guocheng Qian, Michael Vasilkovsky, Hsin-Ying Lee, Chaoyang Wang, Jiaxu Zou, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov -
Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation
(10 Mar 2025)
Yingjie Chen, Yifang Men, Yuan Yao, et al.
Yingjie Chen, Yifang Men, Yuan Yao, Miaomiao Cui, Liefeng Bo -
C-Drag: Chain-of-Thought Driven Motion Controller for Video Generation (27 Feb 2025)
Yuhao Li, Mirana Claire Angel, Salman Khan, et al.
Yu Zhu, Jinqiu Sun, Yanning Zhang, Fahad Shahbaz Khan -
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
(25 Feb 2025)
Koichi Namekata, Sherwin Bahmani, Ziyi Wu, et al.
Yash Kant, Igor Gilitschenski, David B. Lindell -
MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation
(6 Feb 2025)
Jinbo Xing, Long Mai, Cusuh Ham, et al.
Jinbo Xing, Long Mai, Cusuh Ham, Jiahui Huang, Aniruddha Mahapatra, Chi-Wing Fu, Tien-Tsin Wong, Feng Liu -
MotionBridge: Dynamic Video Inbetweening with Flexible Controls (7 Jan 2025)
Maham Tanveer, Yang Zhou, Simon Niklaus, et al.
Ali Mahdavi Amiri, Hao Zhang, Krishna Kumar Singh, Nanxuan Zhao -
TrackGo: A Flexible and Efficient Method for Controllable Video Generation (5 Jan 2025)
Haitao Zhou, Chuang Wang, Rui Nie, et al.
Jinlin Liu, Dongdong Yu, Qian Yu, Changhu Wang -
Motion-Zero: A Zero-Shot Trajectory Control Framework of Moving Object (8 Jan 2025)
Changgu Chen, Junwei Shu, Gaoqi He, et al.
Changgu Chen, Junwei Shu, Gaoqi He, Changbo Wang, Yang Li -
OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation (12 Dec 2024)
Weiqi Li, Shijie Zhao, Chong Mou, et al.
Xuhan Sheng, Zhenyu Zhang, Qian Wang, Junlin Li, Li Zhang, Jian Zhang -
ObjCtrl-2.5D: Training-free Object Control with Camera Poses (10 Dec 2024)
Zhouxia Wang, Yushi Lan, Shangchen Zhou, Chen Change Loy, et al.
-
Motion Prompting: Controlling Video Generation with Motion Trajectories (3 Dec 2024)
Daniel Geng, Charles Herrmann, Junhwa Hur, et al.
Forrester Cole, Serena Zhang, Tobias Pfaff, Tatiana Lopez-Guevara, Carl Doersch, Yusuf Aytar, Michael Rubinstein, Chen Sun, Oliver Wang, Andrew Owens, Deqing Sun -
I2VControl: Disentangled and Unified Video Motion Synthesis Control
(30 Nov 2024)
Wanquan Feng, Tianhao Qi, Jiawei Liu, et al.
Mingzhen Sun, Pengqi Tu, Tianxiang Ma, Fei Dai, Songtao Zhao, Siyu Zhou, Qian He -
InTraGen: Trajectory-controlled Video Generation for Object Interactions (25 Nov 2024)
Zuhao Liu, Aleksandar Yanev, Ahmad Mahmood, et al.
Ivan Nikolov, Saman Motamed, Wei-Shi Zheng, Xi Wang, Luc Van Gool, Danda Pani Paudel -
MotionBooth: Motion-Aware Customized Text-to-Video Generation
(29 Oct 2024)
Jianzong Wu, Xiangtai Li, Yanhong Zeng, et al.
Jianzong Wu, Xiangtai Li, Yanhong Zeng, Jiangning Zhang, Qianyu Zhou, Yining Li, Yunhai Tong, Kai Chen -
DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships (14 Oct 2024)
Zhang Wan, Sheng Tang, Jiawei Wei
-
MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model
(11 Jul 2024)
Muyao Niu, Xiaodong Cun, Xintao Wang, et al.
Yong Zhang, Ying Shan, Yinqiang Zheng -
FreeTraj: Tuning-free Trajectory Control in Video Diffusion Models (T2V) (24 Jun 2024)
Haonan Qiu, Zhaoxi Chen, Zhouxia Wang, et al.
Haonan Qiu, Zhaoxi Chen, Zhouxia Wang, Yingqing He, Menghan Xia, Ziwei Liu -
Image Conductor: Precision Control for Interactive Video Synthesis (21 Jun 2024)
Yaowei Li, Xintao Wang, Zhaoyang Zhang, et al.
Zhouxia Wang, Ziyang Yuan, Liangbin Xie, Yuexian Zou, Ying Shan -
ReVideo: Remake a Video with Motion and Content Control
(22 May 2024)
Chong Mou, Mingdeng Cao, Xintao Wang, et al.
Zhaoyang Zhang, Ying Shan, Jian Zhang -
Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control (27 May 2024)
Zhengfei Kuang, Shengqu Cai, Hao He, et al.
Zhengfei Kuang, Shengqu Cai, Hao He, Yinghao Xu, Hongsheng Li, Leonidas Guibas, Gordon Wetzstein -
Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion
(6 May 2024)
Shiyuan Yang, Liang Hou, Haibin Huang, et al.
Shiyuan Yang, Liang Hou, Haibin Huang, Chongyang Ma, Pengfei Wan, Di Zhang, Xiaodong Chen, Jing Liao -
Peekaboo: Interactive Video Generation via Masked-Diffusion (T2V) (19 Apr 2024)
Yash Jain, Anshul Nasery, Vibhav Vineet, et al.
Yash Jain, Anshul Nasery, Vibhav Vineet, Harkirat Behl -
Video Diffusion Models are Training-free Motion Interpreter and Controller (19 Apr 2024)
Zeqi Xiao, Yifan Zhou, Shuai Yang, Xingang Pan, et al.
-
TrailBlazer: Trajectory Control for Diffusion-Based Video Generation (T2V) (8 Apr 2024)
Wan-Duo Kurt Ma, J.P. Lewis, W. Bastiaan Kleijn -
DragAnything: Motion Control for Anything Using Entity Representation (15 Mar 2024)
Weijia Wu, Zhuang Li, Yuchao Gu, et al.
Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Di Zhang -
Boximator: Generating Rich and Controllable Motions for Video Synthesis
(2 Feb 2024)
Jiawei Wang, Yuchen Zhang, Jiaxin Zou, et al.
Yan Zeng, Guoqiang Wei, Liping Yuan, Hang Li -
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling (31 Jan 2024)
Xiaoyu Shi, Zhaoyang Huang, Fu-Yun Wang, et al.
Weikang Bian, Dasong Li, Yi Zhang, Manyuan Zhang, Ka Chun Cheung, Simon See, Hongwei Qin, Jifeng Dai, Hongsheng Li -
DragNUWA (Image+Text_Traj) (16 Aug 2023)
Shengming Yin, Chenfei Wu, Jian Liang, et al.
Jie Shi, Houqiang Li, Gong Ming, Nan Duan -
MCDiff: Motion-Conditioned Diffusion Model for Controllable Video Synthesis (27 Apr 2023)
Tsai-Shien Chen, Chieh Hubert Lin, et al.
Tsai-Shien Chen, Chieh Hubert Lin, Hung-Yu Tseng, Tsung-Yi Lin, Ming-Hsuan Yang -
Controllable Video Generation With Sparse Trajectories (16 Dec 2018)
Zekun Hao, Xun Huang, Serge Belongie
-
Follow-Your-Creation: Empowering 4D Creation through Video Inpainting (5 Jun 2025)
Yue Ma, Kunyu Feng, Xinhua Zhang, et al.
Yue Ma, Kunyu Feng, Xinhua Zhang, Hongyu Liu, David Junhao Zhang, Jinbo Xing, Yinhan Zhang, Ayden Yang, Zeyu Wang, Qifeng Chen -
Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation
(4 Jun 2025)
Tianyu Huang, Wangguandong Zheng, Tengfei Wang, et al.
Tianyu Huang, Wangguandong Zheng, Tengfei Wang, Yuhao Liu, Zhenwei Wang, Junta Wu, Jie Jiang, Hui Li, Rynson W. H. Lau, Wangmeng Zuo, Chunchao Guo -
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation
(21 Apr 2025)
Chenjie Cao, Jingkai Zhou, Shikai Li, et al.
Chenjie Cao, Jingkai Zhou, Shikai Li, Jingyun Liang, Chaohui Yu, Fan Wang, Xiangyang Xue, Yanwei Fu -
GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography (10 Apr 2025)
Mengchen Zhang, Tong Wu, Jing Tan, et al.
Mengchen Zhang, Tong Wu, Jing Tan, Ziwei Liu, Gordon Wetzstein, Dahua Lin -
OmniCam: Unified Multimodal Video Generation via Camera Control (3 Apr 2025)
Xiaoda Yang, Jiayang Xu, Kaixuan Luan, et al.
Xiaoda Yang, Jiayang Xu, Kaixuan Luan, Xinyu Zhan, Hongshun Qiu, Shijun Shi, Hao Li, Shuai Yang, Li Zhang, Checheng Yu, Cewu Lu, Lixin Yang -
VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation (2 Apr 2025)
Sixiao Zheng, Zimian Peng, Yanpeng Zhou, et al.
Yi Zhu, Hang Xu, Xiangru Huang, Yanwei Fu -
Motion Prompting: Controlling Video Generation with Motion Trajectories (27 Mar 2025)
Daniel Geng, Charles Herrmann, Junhwa Hur, et al.
Daniel Geng, Charles Herrmann, Junhwa Hur, Forrester Cole, Serena Zhang, Tobias Pfaff, Tatiana Lopez-Guevara, Carl Doersch, Yusuf Aytar, Michael Rubinstein, Chen Sun, Oliver Wang, Andrew Owens, Deqing Sun -
Optical Flow Meets Video Diffusion Model for Enhanced Camera-Controlled Video Synthesis (25 Mar 2025)
Wonjoon Jin, Qi Dai, Chong Luo, et al.
Seung-Hwan Baek, Sunghyun Cho -
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers (22 Mar 2025)
Sherwin Bahmani, Ivan Skorokhodov, Guocheng Qian, et al.
Aliaksandr Siarohin, Willi Menapace, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov -
Aether: Geometric-Aware Unified World Modeling
(18 Mar 2025)
Aether Team, et al.
Haoyi Zhu, Yifan Wang, Jianjun Zhou, Wenzheng Chang, Yang Zhou, Zizun Li, Junyi Chen, Chunhua Shen, Jiangmiao Pang, Tong He -
EgoSim: Egocentric Exploration in Virtual Worlds with Multi-modal Conditioning (16 Mar 2025)
Wei Yu, Songheng Yin, Steve Easterbrook, Animesh Garg, et al.
-
I2V3D: Controllable Image-to-Video Generation with 3D Guidance
(12 Mar 2025)
Zhiyuan Zhang, Dongdong Chen, Jing Liao
-
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
(13 Mar 2025)
Hao He, Ceyuan Yang, Shanchuan Lin, et al.
Yinghao Xu, Meng Wei, Liangke Gui, Qi Zhao, Gordon Wetzstein, Lu Jiang, Hongsheng Li -
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
(13 Mar 2025)
Hao He, Yinghao Xu, Yuwei Guo, et al.
Gordon Wetzstein, Bo Dai, Hongsheng Li, Ceyuan Yang -
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
(14 Mar 2025)
Jianhong Bai, Menghan Xia, Xiao Fu, et al.
Xintao Wang, Lianrui Mu, Jinwen Cao, Zuozhu Liu, Haoji Hu, Xiang Bai, Pengfei Wan, Di Zhang -
GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control
(5 Mar 2025)
Xuanchi Ren, Tianchang Shen, Jiahui Huang, et al.
Huan Ling, Yifan Lu, Merlin Nimier-David, Thomas Müller, Alexander Keller, Sanja Fidler, Jun Gao -
Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation (10 Mar 2025)
Yingjie Chen, Yifang Men, Yuan Yao, et al.
Yingjie Chen, Yifang Men, Yuan Yao, Miaomiao Cui, Liefeng Bo -
I2VCONTROL-CAMERA: Precise Video Camera Control with Adjustable Motion Strength (28 Feb 2025)
Wanquan Feng, Jiawei Liu, Pengqi Tu, et al.
Tianhao Qi, Mingzhen Sun, Tianxiang Ma, Songtao Zhao, Siyu Zhou, Qian He -
CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation (12 Feb 2025)
Qinghe Wang, Yawen Luo, Xiaoyu Shi, et al.
Xu Jia, Huchuan Lu, Tianfan Xue, Xintao Wang, Pengfei Wan, Di Zhang, Kun Gai -
Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion
(12 Feb 2025)
Shiyuan Yang, Liang Hou, Haibin Huang, et al.
Chongyang Ma, Pengfei Wan, Di Zhang, Xiaodong Chen, Jing Liao -
RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control (14 Feb 2025)
Teng Li, Guangcong Zheng, Rui Jiang, et al.
Shuigen Zhan, Tao Wu, Yehao Lu, Yining Lin, Xi Li -
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation (7 Feb 2025)
Xiao Fu, Xian Liu, Xintao Wang, et al.
Sida Peng, Menghan Xia, Xiaoyu Shi, Ziyang Yuan, Pengfei Wan, Di Zhang, Dahua Lin -
Latent-Reframe: Enabling Camera Control for Video Diffusion Model without Training (8 Dec 2024)
Zhenghong Zhou, Jie An, Jiebo Luo, et al.
-
CamI2V: Camera-Controlled Image-to-Video Diffusion Model (4 Dec 2024)
Guangcong Zheng, Teng Li, Rui Jiang, et al.
Yehao Lu, Tao Wu, Xi Li -
I2VControl: Disentangled and Unified Video Motion Synthesis Control
(30 Nov 2024)
Wanquan Feng, Tianhao Qi, Jiawei Liu, Mingzhen Sun, Pengqi Tu, Tianxiang Ma, Fei Dai, Songtao Zhao, Siyu Zhou, Qian He
-
Trajectory Attention: Enhancing Video Generation with Fine-Grained Motion Control (28 Nov 2024)
Zeqi Xiao, Wenqi Ouyang, Yifan Zhou, et al.
Shuai Yang, Lei Yang, Jianlou Si, Xingang Pan -
Video Diffusion Models are Training-free Motion Interpreter and Controller (12 Nov 2024)
Zeqi Xiao, Yifan Zhou, Shuai Yang, Xingang Pan, et al.
-
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
(7 Nov 2024)
Wenqiang Sun, Shuo Chen, Fangfu Liu, et al.
Zilong Chen, Yueqi Duan, Jun Zhang, Yikai Wang -
Boosting Camera Motion Control for Video Diffusion Transformers (14 Oct 2024)
Soon Yau Cheong, Duygu Ceylan, Armin Mustafa, et al.
Andrew Gilbert, Chun-Hao Paul Huang -
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention (14 Oct 2024)
Dejia Xu, Yifan Jiang, Chen Huang, et al.
Liangchen Song, Thorsten Gernoth, Liangliang Cao, Zhangyang Wang, Hao Tang -
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis (3 Sep 2024)
Wangbo Yu, Jinbo Xing, Li Yuan, et al.
Wenbo Hu, Xiaoyu Li, Zhipeng Huang, Xiangjun Gao, Tien-Tsin Wong, Ying Shan, Yonghong Tian -
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
(17 Jul 2024)
Sherwin Bahmani, Ivan Skorokhodov, Aliaksandr Siarohin, et al.
Sherwin Bahmani, Ivan Skorokhodov, Aliaksandr Siarohin, Willi Menapace, Guocheng Qian, Michael Vasilkovsky, Hsin-Ying Lee, Chaoyang Wang, Jiaxu Zou, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov -
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation (16 Jul 2024)
Zhouxia Wang, Ziyang Yuan, Xintao Wang, et al.
Zhouxia Wang, Ziyang Yuan, Xintao Wang, Tianshui Chen, Menghan Xia, Ping Luo, Ying Shan -
CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation (4 Jun 2024)
Dejia Xu, Weili Nie, Chao Liu, et al.
Sifei Liu, Jan Kautz, Zhangyang Wang, Arash Vahdat -
MotionMaster: Training-free Camera Motion Transfer For Video Generation (1 May 2024)
Teng Hu, Jiangning Zhang, Ran Yi, et al.
Yating Wang, Hongrui Huang, Jieyu Weng, Yabiao Wang, Lizhuang Ma -
MOTIONFLOW: Learning Implicit Motion Flow for Complex Camera Trajectory Control in Video Generation
(Dec 2024)
Author list not fully provided
-
FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios
(6 May 2025)
[SIGGRAPH 2025] Shiyi Zhang*, Junhao Zhuang*, et al.
Shiyi Zhang*, Junhao Zhuang*, Zhaoyang Zhang, Ying Shan, Yansong Tang -
Follow-Your-Motion: Video Motion Transfer via Efficient Spatial-Temporal Decoupled Finetuning (5 Jun 2025)
Yue Ma, Yulong Liu, Qiyuan Zhu, et al.
Yue Ma, Yulong Liu, Qiyuan Zhu, Ayden Yang, Kunyu Feng, Xinhua Zhang, Zhifeng Li, Sirui Han, Chenyang Qi, Qifeng Chen -
MotionPro: A Precise Motion Controller for Image-to-Video Generation (29 May 2025)
Zhongwei Zhang, Fuchen Long, Zhaofan Qiu, et al.
Zhongwei Zhang, Fuchen Long, Zhaofan Qiu, Yingwei Pan, Wu Liu, Ting Yao, Tao Mei -
DualReal: Joint Training for Lossless Identity-Motion Fusion in Video Customization
(4 May 2025)
Wenchuan Wang, Mengqi Huang, Yijing Tu, et al.
Wenchuan Wang, Mengqi Huang, Yijing Tu, Zhendong Mao -
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer (15 Apr 2025)
Xiang Wang, Shiwei Zhang, Longxiang Tang, et al.
Xiang Wang, Shiwei Zhang, Longxiang Tang, Yingya Zhang, Changxin Gao, Yuehuan Wang, Nong Sang -
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models (hand mask sequence as control signal) (4 Apr 2025)
Rick Akkerman, Haiwen Feng, Michael J. Black, et al.
Rick Akkerman, Haiwen Feng, Michael J. Black, Dimitrios Tzionas, Victoria Fernández Abrevaya -
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance (3 Apr 2025)
Yuxuan Luo, Zhengkun Rong, Lizhen Wang, et al.
Yuxuan Luo, Zhengkun Rong, Lizhen Wang, Longhao Zhang, Tianshu Hu, Yongming Zhu -
Video Motion Transfer with Diffusion Transformers (27 Mar 2025)
Alexander Pondaven, Aliaksandr Siarohin, Sergey Tulyakov, et al.
Alexander Pondaven, Aliaksandr Siarohin, Sergey Tulyakov, Philip Torr, Fabio Pizzati -
Motion Prompting: Controlling Video Generation with Motion Trajectories
(27 Mar 2025)
Daniel Geng, Charles Herrmann, Junhwa Hur, et al.
Daniel Geng, Charles Herrmann, Junhwa Hur, Forrester Cole, Serena Zhang, Tobias Pfaff, Tatiana Lopez-Guevara, Carl Doersch, Yusuf Aytar, Michael Rubinstein, Chen Sun, Oliver Wang, Andrew Owens, Deqing Sun -
EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models (25 Mar 2025)
Yufei Cai, et al.
Yufei Cai, Hu Han, Yuxiang Wei, Shiguang Shan, Xilin Chen -
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance (20 Mar 2025)
Quanhao Li, Zhen Xing, Rui Wang, et al.
Quanhao Li, Zhen Xing, Rui Wang, Hui Zhang, Qi Dai, Zuxuan Wu -
DreamRelation: Relation-Centric Video Customization (10 Mar 2025)
Yujie Wei, Shiwei Zhang, Hangjie Yuan, et al.
Yujie Wei, Shiwei Zhang, Hangjie Yuan, Biao Gong, Longxiang Tang, Xiang Wang, Haonan Qiu, Hengjia Li, Shuai Tan, Yingya Zhang, Hongming Shan -
MotionMatcher: Motion Customization of Text-to-Video Diffusion Models via Motion Feature Matching (18 Feb 2025)
Yen-Siang Wu, Chi-Pin Huang, Fu-En Yang, et al.
Yen-Siang Wu, Chi-Pin Huang, Fu-En Yang, Yu-Chiang Frank Wang -
MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent (5 Feb 2025)
Xinyao Liao, Xianfang Zeng, Liao Wang, et al.
Xinyao Liao, Xianfang Zeng, Liao Wang, Gang Yu, Guosheng Lin, Chi Zhang -
Training-Free Motion-Guided Video Generation with Enhanced Temporal Consistency Using Motion Consistency Loss (13 Jan 2025)
Xinyu Zhang, Zicheng Duan, Dong Gong, et al.
Xinyu Zhang, Zicheng Duan, Dong Gong, Lingqiao Liu -
SST-EM: Advanced Metrics for Evaluating Semantic, Spatial and Temporal Aspects in Video Editing (13 Jan 2025)
Varun Biyyala, Bharat Chanderprakash Kathuria, Jialu Li, et al.
Varun Biyyala, Bharat Chanderprakash Kathuria, Jialu Li, and Youshan Zhang -
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
(9 Jan 2025)
Zekai Gu, Rui Yan, Jiahao Lu, et al.
Zekai Gu, Rui Yan, Jiahao Lu, Peng Li, Zhiyang Dou, Chenyang Si, Zhen Dong, Qifeng Liu, Cheng Lin, Ziwei Liu, Wenping Wang, Yuan Liu -
Spectral Motion Alignment for Video Motion Transfer using Diffusion Models (19 Dec 2024)
Alexander Pondaven, Aliaksandr Siarohin, Sergey Tulyakov, et al.
Alexander Pondaven, Aliaksandr Siarohin, Sergey Tulyakov, Philip Torr, Fabio Pizzati -
MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models (2 Dec 2024)
Xiaomin Li, Xu Jia, Qinghe Wang, et al.
Xiaomin Li, Xu Jia, Qinghe Wang -
OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models (15 Nov 2024)
Mathis Koroglu, Hugo Caselles-Dupré, Guillaume Jeanneret Sanmiguel, et al.
Mathis Koroglu, Hugo Caselles-Dupré, Guillaume Jeanneret Sanmiguel -
Video Diffusion Models are Training-free Motion Interpreter and Controller (12 Nov 2024)
Zeqi Xiao, Yifan Zhou, Shuai Yang, et al.
Zeqi Xiao, Yifan Zhou, Shuai Yang, Xingang Pan -
AnimateAnything: Consistent and Controllable Animation for Video Generation
(16 Nov 2024)
Guojun Lei, Chi Wang, Hong Li, et al.
Guojun Lei, Chi Wang, Hong Li, Rong Zhang, Yikai Wang, Weiwei Xu -
MotionBooth: Motion-Aware Customized Text-to-Video Generation
(29 Oct 2024)
Jianzong Wu, Xiangtai Li, Yanhong Zeng, et al.
Jianzong Wu, Xiangtai Li, Yanhong Zeng, Jiangning Zhang, Qianyu Zhou, Yining Li, Yunhai Tong, Kai Chen -
MotionClone: Training-Free Motion Cloning for Controllable Video Generation
(22 Oct 2024)
Pengyang Ling, Jiazi Bu, Pan Zhang, et al.
Pengyang Ling, Jiazi Bu, Pan Zhang, Xiaoyi Dong, Yuhang Zang, Tong Wu, Huaian Chen, Jiaqi Wang, Yi Jin -
Motion Inversion for Video Customization (16 Oct 2024)
Luozhou Wang, Ziyang Mai, Guibao Shen, et al.
Luozhou Wang, Ziyang Mai, Guibao Shen, Yixuan Liang, Xin Tao, Pengfei Wan, Di Zhang, Yijun Li, Yingcong Chen -
Zero-Shot Controllable Image-to-Video Animation via Motion Decomposition (21 Jul 2024)
Alexander Pondaven, Aliaksandr Siarohin, Sergey Tulyakov, et al.
Alexander Pondaven, Aliaksandr Siarohin, Sergey Tulyakov, Philip Torr, Fabio Pizzati -
DreamMotion: Space-Time Self-Similar Score Distillation for Zero-Shot Video Editing (15 Jul 2024)
Hyeonho Jeong, Jinho Chang, Geon Yeong Park, et al.
Hyeonho Jeong, Jinho Chang, Geon Yeong Park, and Jong Chul Ye -
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
(1 Jun 2024)
Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, et al.
Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, Qingkun Su, Yinghui Xu, Xun Cao, Yao Yao, Hao Zhu, Siyu Zhu -
Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models
(28 Aug 2024)
Yixuan Ren, Yang Zhou, Jimei Yang, et al.
Yixuan Ren, Yang Zhou, Jimei Yang, Jing Shi, Difan Liu, Feng Liu, Mingi Kwon, Abhinav Shrivastava -
Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation
(7 Dec 2023)
Yingjie Chen, Yifang Men, Yuan Yao, et al.
Yingjie Chen, Yifang Men, Yuan Yao, Miaomiao Cui, Liefeng Bo -
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion (7 Dec 2023)
Yujie Wei, Shiwei Zhang, Zhiwu Qing, et al.
Yujie Wei, Shiwei Zhang, Zhiwu Qing, Hangjie Yuan, Zhiheng Liu, Yu Liu, Yingya Zhang, Jingren Zhou, Hongming Shan -
VMC: Video Motion Customization with Pre-trained Diffusion Models (1 Dec 2023)
Hyeonho Jeong, Geon Yeong Park, Jong Chul Ye
Hyeonho Jeong, Geon Yeong Park, Jong Chul Ye -
Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer (3 Dec 2023)
Danah Yatim, Rafail Fridman, Omer Bar-Tal, et al.
Danah Yatim, Rafail Fridman, Omer Bar-Tal, Yoni Kasten, Tali Dekel -
MotionDirector: Motion Customization for Text-to-Video Diffusion Models
(12 Oct 2023)
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, et al.
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jia-Wei Liu, Weijia Wu, Jussi Keppo, and Mike Zheng Shou
-
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
(4 Apr 2025)
Fa-Ting Hong, Zunnan Xu, Zixiang Zhou, et al.
Jun Zhou, Xiu Li, Qin Lin, Qinglin Lu, Dan Xu -
ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model
(27 Mar 2025)
Jinwei Qi, Chaonan Ji, Sheng Xu, et al.
Peng Zhang, Bang Zhang, Liefeng Bo -
MuseTalk: Real-Time High-Fidelity Video Dubbing via Spatio-Temporal Sampling
(26 Mar 2025)
Yue Zhang, Zhizhou Zhong, Minhao Liu, et al.
Zhaokang Chen, Bin Wu, Yubin Zeng, Chao Zhan, Junxin Huang, Yingjie He, Wenjiang Zhou -
Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control
(14 Mar 2025)
[ICLR 2025] Hejia Chen, Haoxian Zhang, Shoulong Zhang, et al.
Xiaoqiang Liu, Sisi Zhuang, Yuan Zhang, Pengfei Wan, Di Zhang, Shuai Li -
Towards High-fidelity 3D Talking Avatar with Personalized Dynamic Texture
(1 Mar 2025)
[CVPR 2025] Xuanchen Li, Jianyu Wang, Yuhao Cheng, et al.
Yikun Zeng, Xingyu Ren, Wenhan Zhu, Weiming Zhao, Yichao Yan -
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
(17 Feb 2025)
Junxian Ma, Shiwen Wang, Jian Yang, et al.
Junyi Hu, Jian Liang, Guosheng Lin, Jingbo Chen, Kai Li, Yu Meng -
Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model
(13 Feb 2025)
Fei Shen, Cong Wang, Junyao Gao, et al.
Qin Guo, Jisheng Dang, Jinhui Tang, Tat-Seng Chua -
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
(13 Feb 2025)
Gaojie Lin, Jianwen Jiang, Jiaqi Yang, et al.
Zerong Zheng, Chao Liang -
CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation
(23 Jan 2025)
[ICLR 2025] Gaojie Lin, Jianwen Jiang, Chao Liang, et al.
Tianyun Zhong, Jiaqi Yang, Yanbo Zheng -
Float: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait
(4 Dec 2024)
Taekyung Ki, Dongchan Min, Gyeongsu Chae
-
SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model
(28 Nov 2024)
Weipeng Tan, Chuming Lin, Chengming Xu, et al.
Xiaozhong Ji, Junwei Zhu, Chengjie Wang, Yunsheng Wu, Yanwei Fu -
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
(15 Nov 2024)
[CVPR 2025] Rang Meng, Xingyu Zhang, Yuming Li, et al.
Chenguang Ma -
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
(31 Oct 2024)
[NeurIPS 2024] Sicheng Xu, Guojun Chen, Yu-Xiao Guo, et al.
Jiaolong Yang, Chong Li, Zhenyu Zang, Yizhong Zhang, Xin Tong, Baining Guo -
MegActor-Σ: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer
(27 Aug 2024)
[AAAI 2025] Shurong Yang, Huadong Li, Juhao Wu, et al.
Minhao Jing, Linze Li, Renhe Ji, Jiajun Liang, Haoqiang Fan, Jin Wang -
MotionCraft: Crafting Whole-Body Motion with Plug-and-Play Multimodal Controls
(25 Aug 2024)
Yuxuan Bian, Ailing Zeng, Xuan Ju, et al.
Xian Liu, Zhaoyang Zhang, Wei Liu, Qiang Xu -
DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
(10 Aug 2024)
Yifeng Ma, Shiwei Zhang, Jiayu Wang, et al.
Xiang Wang, Yingya Zhang, Zhidong Deng -
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions
(12 Jul 2024)
[AAAI 2025] Zhiyuan Chen, Jiajiong Cao, Zhiquan Chen, et al.
Yuming Li, Chenguang Ma -
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation (16 Jun 2024)
Mingwang Xu, Hui Li, Qingkun Su, et al.
Hanlin Shang, Liwei Zhang, Ce Liu, Jingdong Wang, Yao Yao, Siyu Zhu -
V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation
(4 Jun 2024)
Cong Wang, Kuan Tian, Jun Zhang, et al.
Yonghang Guan, Feng Luo, Fei Shen, Zhiwei Jiang, Qing Gu, Xiao Han, Wei Yang -
MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model
(30 May 2024)
[ECCV 2024] Muyao Niu, Xiaodong Cun, Xintao Wang, et al.
Muyao Niu, Xiaodong Cun, Xintao Wang, Yong Zhang, Ying Shan, Yinqiang Zheng -
InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation
(24 May 2024)
Yuchi Wang, Junliang Guo, Jianhong Bai, et al.
Runyi Yu, Tianyu He, Xu Tan, Xu Sun, Jiang Bian -
SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis
(28 Apr 2024)
[CVPR 2024] Ziqiao Peng, Wentao Hu, Yue Shi, et al.
Xiangyu Zhu, Xiaomei Zhang, Hao Zhao, Jun He, Hongyan Liu, Zhaoxin Fan -
EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis
(2 Apr 2024)
Shuai Tan, Bin Ji, Mengxiao Bi, et al.
Ye Pan -
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
(26 Mar 2024)
Huawei Wei, Zejun Yang, Zhisheng Wang
-
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
(27 Feb 2024)
[ECCV 2024] Linrui Tian, Qi Wang, Bang Zhang, et al.
Liefeng Bo -
VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior
(7 Dec 2023)
Xusen Sun, Longhao Zhang, Hao Zhu, et al.
Peng Zhang, Bang Zhang, Xinya Ji, Kangneng Zhou, Daiheng Gao, Liefeng Bo, Xun Cao -
Audio-Driven Co-Speech Gesture Video Generation (5 Dec 2022)
[NeurIPS 2022] Xian Liu, Qianyi Wu, Hang Zhou, et al.
Xian Liu, Qianyi Wu, Hang Zhou, Yuanqi Du, Wayne Wu, Dahua Lin, Ziwei Liu -
A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild (23 Aug 2020)
[ACM MM 2020] K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, et al.
K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar -
Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose (5 Mar 2020)
[TMM 2022] Ran Yi, Zipeng Ye, Juyong Zhang, et al.
Ran Yi, Zipeng Ye, Juyong Zhang, Hujun Bao, Yong-Jin Liu -
Robust One Shot Audio to Video Generation (2020)
[CVPRW 2020] Neeraj Kumar, Srishti Goel, Ankur Narang, et al.
Neeraj Kumar, Srishti Goel, Ankur Narang, Mujtaba Hasan
-
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice
(7 Mar 2025)
Hongwei Yi, Tian Ye, Shitong Shao, Xuancheng Yang, et al.
Hongwei Yi, Tian Ye, Shitong Shao, Xuancheng Yang, Jiantong Zhao, Hanzhong Guo, Terrance Wang, Qingyu Yin, Zeke Xie, Lei Zhu, Wei Li, Michael Lingelbach, Daquan Zhou -
MotionCraft: Crafting Whole-Body Motion with Plug-and-Play Multimodal Controls
(19 Dec 2024)
[AAAI 2025] Yuxuan Bian, Ailing Zeng, Xuan Ju, et al.
Yuxuan Bian, Ailing Zeng, Xuan Ju, Xian Liu, Zhaoyang Zhang, Wei Liu, Qiang Xu -
AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation (19 Dec 2024)
Moayed Haji-Ali, Willi Menapace, Aliaksandr Siarohin, et al.
Moayed Haji-Ali, Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov, Alper Canberk, Kwot Sin Lee, Vicente Ordonez, Sergey Tulyakov -
Audio-Synchronized Visual Animation (8 Mar 2024)
[ECCV 2024] Lin Zhang, Shentong Mo, Yijing Zhang, et al.
Lin Zhang, Shentong Mo, Yijing Zhang, Pedro Morgado -
TA2V: Text-Audio Guided Video Generation (1 Jan 2024)
[TMM] Minglu Zhao, Wenmin Wang, Tongbao Chen, et al.
Minglu Zhao, Wenmin Wang, Tongbao Chen, Rui Zhang, Ruochen Li -
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation (28 Sep 2023)
[AAAI 2024] Guy Yariv, Itai Gat, Sagie Benaim, et al.
Guy Yariv, Itai Gat, Sagie Benaim, Lior Wolf, Idan Schwartz, Yossi Adi
-
Wan: Open and Advanced Large-Scale Video Generative Models (2025)
WanTeam, Ang Wang, Baole Ai, et al.
Bin Wen, Chaojie Mao, Chen-Wei Xie, Di Chen, Feiwu Yu, Haiming Zhao, Jianxiao Yang, Jianyuan Zeng, Jiayu Wang, Jingfeng Zhang, Jingren Zhou, Jinkai Wang, Jixuan Chen, Kai Zhu, Kang Zhao, Keyu Yan, Lianghua Huang, Mengyang Feng, Ningyi Zhang, Pandeng Li, Pingyu Wu, Ruihang Chu, Ruili Feng, Shiwei Zhang, Siyang Sun, Tao Fang, Tianxing Wang, Tianyi Gui, Tingyu Weng, Tong Shen, Wei Lin, Wei Wang1, Wei Wang2, Wenmeng Zhou, Wente Wang, Wenting Shen, Wenyuan Yu, Xianzhong Shi, Xiaoming Huang, Xin Xu, Yan Kou, Yangyu Lv, Yifei Li, Yijing Liu, Yiming Wang, Yingya Zhang, Yitong Huang, Yong Li, You Wu, Yu Liu, Yulin Pan, Yun Zheng, Yuntao Hong, Yupeng Shi, Yutong Feng, Zeyinzi Jiang, Zhen Han, Zhi-Fan Wu, Ziyu Liu -
Text-Animator: Controllable Visual Text Video Generation (25 Jun 2024)
Lin Liu, Quande Liu, Shengju Qian, et al.
Yuan Zhou, Wengang Zhou, Houqiang Li, Lingxi Xie, Qi Tian
-
Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models
(8 Jun 2025)
Sangwon Jang, Taekyung Ki, Jaehyeong Jo, et al.
Sangwon Jang, Taekyung Ki, Jaehyeong Jo, Jaehong Yoon, Soo Ye Kim, Zhe Lin, Sung Ju Hwang -
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding
(15 Apr 2025)
Dianbing Xi, Jiepeng Wang, Yuanzhi Liang, et al.
Xi Qiu, Yuchi Huo, Rui Wang, Chi Zhang, Xuelong Li -
StyleMaster: Stylize Your Video with Artistic Generation and Translation (10 Dec 2024)
Zixuan Ye, Huijuan Huang, Xintao Wang, et al.
Pengfei Wan, Di Zhang, Wenhan Luo -
FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation (19 Mar 2024)
[CVPR 2024] Shuai Yang, Yifan Zhou, Ziwei Liu, et al.
Chen Change Loy -
StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter (1 Dec 2023)
[TOG 2024] Gongye Liu, Menghan Xia, Yong Zhang, et al.
Haoxin Chen, Jinbo Xing, Yibo Wang, Xintao Wang, Yujiu Yang, Ying Shan -
UniVST: A Unified Framework for Training-free Localized Video Style Transfer (26 Oct 2024)
Quanjian Song, Mingbao Lin, Wengyi Zhan, et al.
Shuicheng Yan, Liujuan Cao, Rongrong Ji -
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation (13 Jun 2023)
[SIGGRAPH Asia 2023] Shuai Yang, Yifan Zhou, Ziwei Liu, et al.
Chen Change Loy -
VToonify: Controllable High-Resolution Portrait Video Style Transfer (22 Sep 2022)
[SIGGRAPH Asia 2022] Shuai Yang, Liming Jiang, Ziwei Liu, et al.
Chen Change Loy
-
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control (7 Jan 2025)
Zekai Gu, Rui Yan, Jiahao Lu, Peng Li, et al.
Zhiyang Dou, Chenyang Si, Zhen Dong, Qifeng Liu, Cheng Lin, Ziwei Liu, Wenping Wang, Yuan Liu -
GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking (5 Jan 2025)
Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, et al.
Yijin Li, Fu-Yun Wang, Hongsheng Li -
Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation (8 Dec 2024)
[CVPR 2025] Hyeonho Jeong, Chun-Hao Paul Huang, et al.
Jong Chul Ye, Niloy Mitra, Duygu Ceylan -
Drag-A-Video: Non-rigid Video Editing with Point-based Interaction (5 Dec 2023)
Yao Teng, Enze Xie, Yue Wu, Haoyu Han, et al.
Zhenguo Li, Xihui Liu
-
Seeing Beyond Views: Multi-View Driving Scene Video Generation with Holistic Attention (4 Dec 2024)
Hannan Lu, Xiaohe Wu, Shudong Wang, et al.
Hannan Lu, Xiaohe Wu, Shudong Wang, Xiameng Qin, Xinyu Zhang, Junyu Han, Wangmeng Zuo, Ji Tao -
MagicDrive-V2: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control (21 Nov 2024)
Ruiyuan Gao, Kai Chen, Bo Xiao, et al.
Ruiyuan Gao, Kai Chen, Bo Xiao, Lanqing Hong, Zhenguo Li, Qiang Xu -
DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes (6 Sep 2024)
Jianbiao Mei, Tao Hu, Xuemeng Yang, et al.
Jianbiao Mei, Tao Hu, Xuemeng Yang, Licheng Wen, Yu Yang, Tiantian Wei, Yukai Ma, Min Dou, Botian Shi, Yong Liu -
DiVE: DiT-based Video Generation with Enhanced Control (3 Sep 2024)
Junpeng Jiang, Gangyi Hong, Lijun Zhou, et al.
Junpeng Jiang, Gangyi Hong, Lijun Zhou, Enhui Ma, Hengtong Hu, Xia Zhou, Jie Xiang, Fan Liu, Kaicheng Yu, Haiyang Sun, Kun Zhan, Peng Jia, Miao Zhang -
Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation (3 Jun 2024)
Enhui Ma, Lijun Zhou, Tao Tang, et al.
Enhui Ma, Lijun Zhou, Tao Tang, Zhan Zhang, Dong Han, Junpeng Jiang, Kun Zhan, Peng Jia, Xianpeng Lang, Haiyang Sun, Di Lin, Kaicheng Yu -
DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation (11 Mar 2024)
Guosheng Zhao, Xiaofeng Wang, Zheng Zhu, et al.
Guosheng Zhao, Xiaofeng Wang, Zheng Zhu, Xinze Chen, Guan Huang, Xiaoyi Bao, Xingang Wang -
Panacea: Panoramic and Controllable Video Generation for Autonomous Driving (28 Nov 2023)
Yuqing Wen, Yucheng Zhao, Yingfei Liu, et al.
Yuqing Wen, Yucheng Zhao, Yingfei Liu, Fan Jia, Yanhui Wang, Chong Luo, Chi Zhang, Tiancai Wang, Xiaoyan Sun, Xiangyu Zhang -
MagicDrive: Street View Generation with Diverse 3D Geometry Control (4 Oct 2023)
Ruiyuan Gao, Kai Chen, Enze Xie, et al.
Ruiyuan Gao, Kai Chen, Enze Xie, Lanqing Hong, Zhenguo Li, Dit-Yan Yeung, Qiang Xu
-
Aether: Geometric-Aware Unified World Modeling
(18 Mar 2025)
Aether Team, et al.
Haoyi Zhu, Yifan Wang, Jianjun Zhou, Wenzheng Chang, Yang Zhou, Zizun Li, Junyi Chen, Chunhua Shen, Jiangmiao Pang, Tong He -
VACE: All-in-One Video Creation and Editing
(11 Mar 2025)
Zeyinzi Jiang, Zhen Han, Chaojie Mao, et al.
Zeyinzi Jiang, Zhen Han, Chaojie Mao, Jingfeng Zhang, Yulin Pan, Yu Liu -
FullDiT: Multi-Task Video Generative Foundation Model with Full Attention
(25 Mar 2025)
Xuan Ju, Weicai Ye, Quande Liu, et al.
Xuan Ju, Weicai Ye, Quande Liu, Qiulin Wang, Xintao Wang, Pengfei Wan, Di Zhang, Kun Gai, Qiang Xu -
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation
(31 Mar 2025)
Shengqiong Wu, Weicai Ye, Jiahao Wang, et al.
Shengqiong Wu, Weicai Ye, Jiahao Wang, Quande Liu, Xintao Wang, Pengfei Wan, Di Zhang, Kun Gai, Shuicheng Yan, Hao Fei, Tat-Seng Chua -
VideoComposer: Compositional Video Synthesis with Motion Controllability
(6 Jun 2023)
Xiang Wang, Hangjie Yuan, Shiwei Zhang, et al.
Xiang Wang, Hangjie Yuan, Shiwei Zhang, Dayou Chen, Jiuniu Wang, Yingya Zhang, Yujun Shen, Deli Zhao, Jingren Zhou