(CVPR 2025) Official implementation of the paper "Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks"
arXiv version: Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks
If you find this paper helpful, please cite our work:
@article{huang2025modeling,
title={Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks},
author={Huang, Wei-Jin and Li, Yuan-Ming and Xia, Zhi-Wei and Tang, Yu-Ming and Lin, Kun-Yu and Hu, Jian-Fang and Zheng, Wei-Shi},
journal={arXiv preprint arXiv:2503.22405},
year={2025}
}
Download the datasets from their official releases: EgoPER, HoloAssist, CaptainCook4D
EgoPER Dataset File Structure
EgoPER/
├── coffee
│ ├── features_10fps_dinov2
│ ├── features_10fps_new
│ ├── frames_10fps_new
│ ├── test.txt
│ ├── training.txt
│ ├── trim_start_end.txt
│ ├── trim_videos
│ └── validation.txt
├── oatmeal
│ ├── features_10fps_dinov2
│ ├── features_10fps_new
│ ├── frames_10fps_new
│ ├── test.txt
│ ├── training.txt
│ ├── trim_start_end.txt
│ ├── trim_videos
│ └── validation.txt
├── ......
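Before training, it can help to confirm that a task folder matches the layout above. The following is a minimal sketch, not part of this repository: the helper name `missing_entries` is hypothetical, and the expected entries are copied directly from the directory tree shown above.

```python
from pathlib import Path

# Entries copied from the EgoPER directory tree above.
EXPECTED_DIRS = [
    "features_10fps_dinov2",
    "features_10fps_new",
    "frames_10fps_new",
    "trim_videos",
]
EXPECTED_FILES = [
    "test.txt",
    "training.txt",
    "trim_start_end.txt",
    "validation.txt",
]


def missing_entries(root, task):
    """Return the expected entries that are absent under <root>/<task>.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    task_dir = Path(root) / task
    missing = [d for d in EXPECTED_DIRS if not (task_dir / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (task_dir / f).is_file()]
    return missing


if __name__ == "__main__":
    # "EgoPER" is assumed to be the dataset root in the current directory.
    for task in ("coffee", "oatmeal"):
        print(task, missing_entries("EgoPER", task) or "ok")
```

An empty list means every listed folder and split file is present for that task.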
A training example for the tea task in EgoPER
- Train Action Segmentation Model (ActionFormer)
python train.py ./configs/EgoPER/tea_aod_online.yaml --output af200
- Train Error Detection Model
python train.py ./configs/EgoPER/tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200.yaml --resume ./ckpt/EgoPER/tea_aod_online_af200/epoch_205.pth.tar --output 1st
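The `--resume` path above suggests that checkpoints are written to a folder named after the config file and the `--output` suffix. The sketch below only encodes that observed pattern; the helper `ckpt_dir` is hypothetical, and the naming rule is an assumption inferred from the two commands, not a documented guarantee of `train.py`.

```python
from pathlib import Path


def ckpt_dir(config_path, output, ckpt_root="./ckpt"):
    """Guess the checkpoint folder for a config file and --output suffix.

    Assumed pattern, inferred from the commands above:
    <ckpt_root>/<config parent folder>/<config file stem>_<output>
    """
    cfg = Path(config_path)
    # .stem drops only the final ".yaml", so dots inside the name are kept.
    return f"{ckpt_root}/{cfg.parent.name}/{cfg.stem}_{output}"


if __name__ == "__main__":
    # Folder produced by the first command and resumed from by the second.
    print(ckpt_dir("./configs/EgoPER/tea_aod_online.yaml", "af200"))
```

For the action segmentation run above this gives `./ckpt/EgoPER/tea_aod_online_af200`, the folder that holds the checkpoint passed to `--resume`.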
An inference example for the tea task in EgoPER
- Use the Action Segmentation Model to get the segmentation result
python test.py ./configs/EgoPER/tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200.yaml ./ckpt/EgoPER/tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200_1st
- Use the Error Detection Model to detect errors
python test_ed.py ./configs/EgoPER/tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200.yaml ./ckpt/EgoPER/tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200_1st
- Calculate Test Metrics
python metric_vis_multiprocess.py --task tea --dirname tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200_1st/ -as -ed
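Because the three inference commands share one config name and one run folder, they can be generated from a few values. The following is a rough sketch under that assumption: `inference_commands` is a hypothetical helper, and it only builds the argument lists in the same form as the commands above without executing anything.

```python
import shlex


def inference_commands(task, config_name, output, dataset="EgoPER"):
    """Build the three inference commands shown above as argument lists.

    Hypothetical helper; config_name is the config file name without ".yaml",
    and output is the --output suffix used at training time.
    """
    config = f"./configs/{dataset}/{config_name}.yaml"
    run = f"{config_name}_{output}"
    ckpt = f"./ckpt/{dataset}/{run}"
    return [
        ["python", "test.py", config, ckpt],     # action segmentation
        ["python", "test_ed.py", config, ckpt],  # error detection
        ["python", "metric_vis_multiprocess.py",
         "--task", task, "--dirname", f"{run}/", "-as", "-ed"],  # metrics
    ]


if __name__ == "__main__":
    # A short config name is used here purely for readability.
    for cmd in inference_commands("tea", "tea_aod_online", "af200"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to run the steps in order and stop at the first failure.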
More examples can be found in run_sh_EgoPER/run_clusterCenter_it0.6_addedMax_ca_usenorm_unfreeze_winlen32_dilation3_dilalayers5_fr5_cu10_cus0.40_m0.9_dp0.1_e200.sh
The AMNAR implementation is based on ActionFormer and EgoPED. We recommend reading the code of both projects to better understand this repository.