MLP: Motion Label Prior for Temporal Sentence Localization in Untrimmed 3D Human Motions

Description

Official PyTorch implementation of the paper "MLP: Motion Label Prior for Temporal Sentence Localization in Untrimmed 3D Human Motions" (TCSVT 2024).

Installation

1. Create conda environment

conda create --name mlp python=3.7
conda activate mlp

Install PyTorch 1.13 inside the conda environment, and install the following packages:

conda install ipykernel
pip install matplotlib
pip install tqdm
pip install scipy h5py coloredlogs 
pip install omegaconf
pip install hydra-core
pip install seaborn
pip install peft
pip install einops
pip install tensorboard tensorboardX tensorboard_logger
pip install orjson

The code was tested on Python 3.7 and PyTorch 1.13.
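To confirm that the environment is complete before moving on, a small check like the one below can help. It is only a sketch: the import names are assumptions mapped from the package names above (for example, hydra-core is imported as hydra), and the helper name missing_modules is hypothetical and not part of this repository.

```python
import importlib.util

# Import names assumed to correspond to the packages installed above.
REQUIRED_MODULES = [
    "torch", "matplotlib", "tqdm", "scipy", "h5py", "coloredlogs",
    "omegaconf", "hydra", "seaborn", "peft", "einops",
    "tensorboard", "tensorboardX", "tensorboard_logger", "orjson",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Run it inside the activated mlp environment; any name it reports can then be installed with the matching command above.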

2. Download the datasets

Motion representation

Thanks to HumanML3D, we adopt the same method to extract the joint representation for both the BABEL and HumanML3D (Restore) datasets. Please follow the instructions in raw_pose_processing.ipynb of the HumanML3D repo to obtain the pose_data folder. During this process, you need to download the corresponding AMASS datasets as prompted. The entire data processing may take up to 2 days. Afterwards, copy or symlink the pose_data folder into datasets/motions/:

ln -s /path/to/HumanML3D/pose_data datasets/motions/pose_data

Then, we use the scripts from TMR to compute the HumanML3D Guo features on the whole AMASS (+HumanAct12) dataset. Run the following command:

python -m prepare.compute_guoh3dfeats

This should compute the features (plus a mirrored version) and save them in datasets/motions/guoh3dfeats. These features are shared across TSLM experiments because the BABEL and HumanML3D (Restore) datasets both annotate motions from AMASS.

Get motion-text annotations

BABEL dataset

❗ We cannot directly provide original data files to abide by the license.

Visit https://babel.is.tue.mpg.de/ to download the BABEL dataset. At the time of our experiments, we used babel_v1.0_release. The BABEL dataset should be located at datasets/babel/babel_v1.0_release. The file structure under datasets/babel/babel_v1.0_release looks like:

.
├── extra_train.json
├── extra_val.json
├── test.json
├── train.json
└── val.json
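As a quick sanity check that the download is in place, the sketch below counts the top-level entries in each of the five files. It assumes only that each file is valid JSON; the helper name summarize_babel is hypothetical and is not part of this repository.

```python
import json
from pathlib import Path

# File names taken from the listing above.
BABEL_FILES = ["train.json", "val.json", "test.json", "extra_train.json", "extra_val.json"]


def summarize_babel(root):
    """Map each expected file to its number of top-level entries, or None if it is missing."""
    root = Path(root)
    summary = {}
    for name in BABEL_FILES:
        path = root / name
        if not path.is_file():
            summary[name] = None
            continue
        with path.open("r", encoding="utf-8") as f:
            summary[name] = len(json.load(f))
    return summary


if __name__ == "__main__":
    for name, count in summarize_babel("datasets/babel/babel_v1.0_release").items():
        print(f"{name}: {'MISSING' if count is None else count}")
```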

To follow the unified specification of AMASS annotations, we use AMASS-Annotation-Unifier to post-process this annotation data and obtain babel.json and babel_extra.json. The entire pipeline is embedded in this project. Please run the following command:

python prepare/babel_aau.py

HumanML3D (Restore) dataset

We are very grateful to the original HumanML3D authors for directly open-sourcing the annotation data. We provide annotation data that conforms to the unified annotation specification of AMASS. Please see the file datasets/humanml3d/humanml3d.json for the new HumanML3D (Restore) dataset. This is the final valid version. For details on how it was produced, please refer to the description in our paper.

Overall, the expected datasets folder structure is as follows:

datasets
├── babel
│   ├── babel_extra.json
│   ├── babel.json
│   └── splits
│       ├── test_extra.txt
│       ├── test.txt
│       ├── train_extra.txt
│       └── train.txt
├── humanml3d
│   ├── humanml3d.json
│   └── splits
│       ├── all.txt
│       ├── test.txt
│       ├── train.txt
│       └── val.txt
└── motions
    ├── pose_data
    │   └── ...
    └── guoh3dfeats
        └── ...
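Before training, it can be useful to verify this layout programmatically. The following is a minimal sketch that checks only the annotation and split files listed in the tree; the motion folders are left out because their contents depend on the preprocessing step. The helper name missing_files is hypothetical and not part of this repository.

```python
from pathlib import Path

# Relative paths taken from the annotation and split files in the tree above.
EXPECTED_FILES = [
    "babel/babel.json",
    "babel/babel_extra.json",
    "babel/splits/train.txt",
    "babel/splits/test.txt",
    "babel/splits/train_extra.txt",
    "babel/splits/test_extra.txt",
    "humanml3d/humanml3d.json",
    "humanml3d/splits/all.txt",
    "humanml3d/splits/train.txt",
    "humanml3d/splits/val.txt",
    "humanml3d/splits/test.txt",
]


def missing_files(root="datasets", expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    print("Layout looks complete." if not missing else "Missing: " + ", ".join(missing))
```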

3. Download text model dependencies

Download Roberta and MPNet from Hugging Face

cd deps/
git lfs install
git clone https://huggingface.co/roberta-base
git clone https://huggingface.co/sentence-transformers/all-mpnet-base-v2
cd ..

Make sure you have git-lfs installed on your device. These steps are not the only option; the models can also be downloaded via huggingface-cli. Roberta is used as a word embedding extractor, while MPNet computes the similarity between texts, which is used to measure false-negative moments.
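To illustrate how text similarity can flag candidate false-negative moments, the sketch below computes the cosine similarity between a query embedding and the embeddings of other annotations, and returns those at or above a threshold. It uses plain Python lists so that it runs without extra dependencies; in practice the vectors would come from the MPNet model. The function names and the 0.8 threshold are hypothetical and are not taken from the MLP code.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def likely_false_negatives(query_vec, other_vecs, threshold=0.8):
    """Return indices of embeddings whose similarity to the query is at least the threshold."""
    return [
        i for i, vec in enumerate(other_vecs)
        if cosine_similarity(query_vec, vec) >= threshold
    ]
```

A moment whose description is this close to the query would be unreliable as a negative sample, which is the intuition behind measuring false negatives with a sentence encoder.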

How to train MLP

The command to start MLP training is as follows:

# train MLP on BABEL dataset
python train.py model=mlp data=babel model.beta=0.1
# train MLP on HumanML3D (Restore) dataset
python train.py model=mlp data=humanml3d model.beta=0.2

You can modify the model parameters in the 'configs/model/mlp.yaml' file. On the first run, the .h5 file containing the model inputs is generated automatically, which may take 2-4 hours.

Similarly, to train MLPBase, please use the following command:

# train MLPBase on BABEL dataset
python train.py model=mlpbase data=babel
# train MLPBase on HumanML3D (Restore) dataset
python train.py model=mlpbase data=humanml3d
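When launching several runs, the commands above can be assembled programmatically. The following is a minimal sketch; build_train_command is a hypothetical helper that is not part of this repository, and it only reproduces the override syntax shown above.

```python
def build_train_command(model, data, beta=None):
    """Build the train.py command line as a list of arguments."""
    cmd = ["python", "train.py", f"model={model}", f"data={data}"]
    if beta is not None:
        # model.beta is only passed for MLP in the commands above.
        cmd.append(f"model.beta={beta}")
    return cmd


if __name__ == "__main__":
    runs = [
        ("mlp", "babel", 0.1),
        ("mlp", "humanml3d", 0.2),
        ("mlpbase", "babel", None),
        ("mlpbase", "humanml3d", None),
    ]
    for model, data, beta in runs:
        print(" ".join(build_train_command(model, data, beta)))
```

Each list can be passed to subprocess.run(cmd, check=True) from the repository root to start the corresponding run.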

Evaluating MLP

Use the following command to evaluate MLP:

python eval.py folder=FOLDER

Replace FOLDER with the experiment output directory, for example outputs/babel/batchsize64/mlp/2024-02-29_17-34-36. After the evaluation, visual results are automatically generated in the 'FOLDER/qualitative' folder, and the top-5 prediction results are saved in the 'FOLDER/mlp/prediction' folder.
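If an experiment directory holds several runs, the most recent one can be selected by its timestamped folder name. This is a sketch under the assumption that run folders are named like the example above (YYYY-MM-DD_HH-MM-SS); the helper latest_run is hypothetical and not part of this repository.

```python
from datetime import datetime
from pathlib import Path

# Timestamp format inferred from the example folder name above.
RUN_NAME_FORMAT = "%Y-%m-%d_%H-%M-%S"


def latest_run(experiment_dir):
    """Return the most recent timestamped run folder, or None if there is none."""
    runs = []
    for child in Path(experiment_dir).iterdir():
        if not child.is_dir():
            continue
        try:
            stamp = datetime.strptime(child.name, RUN_NAME_FORMAT)
        except ValueError:
            continue  # skip folders that are not named like a run
        runs.append((stamp, child))
    return max(runs, key=lambda item: item[0])[1] if runs else None
```

For example, latest_run("outputs/babel/batchsize64/mlp") returns a path that can be passed as folder=... to eval.py.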

Citation

If you find this code useful for your research, please consider citing our paper.

@article{yan2024mlp,
  title={MLP: Motion Label Prior for Temporal Sentence Localization in Untrimmed 3D Human Motions},
  author={Yan, Sheng and Liu, Mengyuan and Wang, Yong and Liu, Yang and Liu, Hong},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  year={2024},
  publisher={IEEE}
}

Acknowledgments

This code stands on the shoulders of giants. We would like to thank the following projects that our code is based on:

VSLNet, LGI, HumanML3D

License

This code is distributed under the Apache License 2.0.
