HuangZiheng-o-O/self-correcting_motion_synthesis

Self-Correcting Human Motion Synthesis with Video Analysis

For trajectory guidance, please see my trajectory_guidance repository.

This project tackles the challenge of generating detailed, context-rich 3D human motions from textual descriptions that go beyond the standard training data. The framework integrates a multi-agent system, powered by large language models and a vision-language module, to segment, synthesize, and refine motion outputs in an iterative loop. By employing a mask-transformer architecture with body-part-specific encoders and codebooks, we achieve granular control over both short and extended motion sequences. After initial generation, an automated review process uses video-based captioning to identify discrepancies between the prompt and the generated motion and to produce corrective instructions, so that each body region can be adjusted individually. Experimental results on the HumanML3D benchmark show that this approach not only attains competitive performance against recent methods but also excels at handling long-form prompts and multi-step motion compositions. User studies further indicate significant improvements in realism and fidelity for complex scenarios.

A person does Bruce Lee's classic kicks, and runs forward with right arm extending forward, and trying to avoid sphere obstacles in his way.

Bruce.Lee.s.classic.kicks.mp4

A woman picks up speed from a walk to a run, holding the T-pose.

T-pose.mp4

A person sits on the floor with hands resting on their knees, then reaches forward with their right arm trying to grab something.

grab.mp4

An angry midfielder performs a slide tackle on another player.

slide.tackle.mp4

Model

An illustrative example of the workflow:

[Figure: framework]

[Figure: example]
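The generate, review, and correct loop described above can be sketched roughly as follows. This is a minimal illustration and not the code in this repository: `self_correct`, the four callables it takes (`plan`, `synthesize`, `caption`, `review`), the per-body-part instruction dictionary, and the `max_rounds` limit are all hypothetical names and formats chosen for the example. The toy stand-ins below replace the LLM agents, the motion generator, and the video captioner with plain string operations so that the control flow can be run on its own.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical format: body part name -> text instruction for that part.
Instructions = Dict[str, str]


def self_correct(
    prompt: str,
    plan: Callable[[str], Instructions],
    synthesize: Callable[[Instructions], Any],
    caption: Callable[[Any], str],
    review: Callable[[str, str], Instructions],
    max_rounds: int = 3,
) -> Tuple[Any, int]:
    """Generate a motion, then repeatedly caption it, compare the caption
    with the prompt, and regenerate with per-body-part corrections.

    Returns the final motion and the number of correction rounds applied.
    """
    instructions = dict(plan(prompt))
    motion = synthesize(instructions)
    applied = 0
    for _ in range(max_rounds):
        description = caption(motion)            # what the result looks like
        corrections = review(prompt, description)
        if not corrections:                      # reviewer found no discrepancy
            break
        instructions.update(corrections)         # adjust only the flagged parts
        motion = synthesize(instructions)
        applied += 1
    return motion, applied


# Toy stand-ins (assumptions for illustration only).
def toy_plan(prompt: str) -> Instructions:
    # Pretend the first plan misses the arm instruction.
    return {"legs": "run forward"}


def toy_synthesize(instructions: Instructions) -> Instructions:
    # The "motion" here is just a copy of the instructions.
    return dict(instructions)


def toy_caption(motion: Instructions) -> str:
    return "; ".join(f"{part}: {text}" for part, text in sorted(motion.items()))


def toy_review(prompt: str, description: str) -> Instructions:
    # Flag the right arm if the caption never mentions it.
    if "right_arm" not in description:
        return {"right_arm": "extend forward"}
    return {}


motion, rounds = self_correct(
    "A person runs forward with right arm extending forward.",
    toy_plan, toy_synthesize, toy_caption, toy_review,
)
print(motion, rounds)
```

In this sketch the reviewer returns corrections keyed by body part, so only the flagged region's instruction changes between rounds while the others are kept. That mirrors the per-region adjustment described above, under the assumption that instructions can be updated independently per part.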

Acknowledgements

We sincerely thank the authors of the following open-source works, on which our code is based: momask-codes, deep-motion-editing, Muse, vector-quantize-pytorch, T2M-GPT, MDM, and MLD.
