For the trajectory guidance component, please see my trajectory_guidance repository.
This project tackles the challenge of generating detailed, context-rich 3D human motions from textual descriptions that go beyond standard training data. The framework integrates a multi-agent system—powered by large language models and a vision-language module—to segment, synthesize, and refine motion outputs in an iterative loop. By employing a mask-transformer architecture with body-part-specific encoders and codebooks, we achieve granular control over both short and extended motion sequences. After initial generation, an automated review process uses video-based captioning to identify discrepancies and generate corrective instructions, allowing each body region to be adjusted accurately. Experimental results on the HumanML3D benchmark demonstrate that this approach not only attains competitive performance against recent methods but also excels at handling long-form prompts and multi-step motion compositions. Comprehensive user studies further indicate significant improvements in realism and fidelity for complex scenarios.
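To make the generate–review–refine loop concrete, here is a minimal Python sketch of the pipeline described above. Every name in it (`Motion`, `segment_prompt`, `synthesize`, `caption_video`, `plan_corrections`, `refine`) is a hypothetical placeholder rather than this repository's actual API; the stubs only illustrate how the agents pass information to each other under those assumptions.

```python
# Hypothetical sketch of the iterative generate -> caption -> correct loop.
# All names are placeholders, not the repository's real modules.

from dataclasses import dataclass, field


@dataclass
class Motion:
    """Dummy stand-in for a generated motion sequence."""
    description: str
    refined_parts: list = field(default_factory=list)


def segment_prompt(prompt: str) -> list[str]:
    # LLM agent: split a long prompt into per-step sub-descriptions (stubbed).
    return [s.strip() for s in prompt.split(",") if s.strip()]


def synthesize(sub_prompts: list[str]) -> Motion:
    # Mask-transformer generator with body-part codebooks (stubbed).
    return Motion(description="; ".join(sub_prompts))


def caption_video(motion: Motion) -> str:
    # Vision-language module: caption a rendering of the motion (stubbed).
    return motion.description


def plan_corrections(prompt: str, caption: str) -> list[str]:
    # LLM reviewer: list body regions whose motion contradicts the prompt (stubbed).
    return [] if caption else ["right_arm"]


def refine(motion: Motion, corrections: list[str]) -> Motion:
    # Re-sample only the flagged body parts via masked generation (stubbed).
    motion.refined_parts.extend(corrections)
    return motion


def generate_motion(prompt: str, max_rounds: int = 3) -> Motion:
    motion = synthesize(segment_prompt(prompt))
    for _ in range(max_rounds):
        corrections = plan_corrections(prompt, caption_video(motion))
        if not corrections:
            break  # caption agrees with the prompt: stop refining
        motion = refine(motion, corrections)
    return motion


if __name__ == "__main__":
    print(generate_motion("A person kicks, runs forward, and avoids obstacles"))
```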
A person performs Bruce Lee's classic kicks, then runs forward with the right arm extended, trying to avoid sphere obstacles in his way.
Bruce.Lee.s.classic.kicks.mp4
A woman picks up speed from a walk to a run while holding a T-pose.
T-pose.mp4
A person sits on the floor with hands resting on their knees, then reaches forward with their right arm, trying to grab something.
grab.mp4
An angry midfielder performs a slide tackle on another player.
slide.tackle.mp4
An illustrative example of the workflow:
We sincerely thank the authors of the following open-source works, on which our code is based: momask-codes, deep-motion-editing, Muse, vector-quantize-pytorch, T2M-GPT, MDM, and MLD.