Skip to content

(CVPR 2025) Official implementation of Paper ''Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks''

License

Notifications You must be signed in to change notification settings

iSEE-Laboratory/AMNAR

Repository files navigation

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

(CVPR 2025) Official implementation of Paper ''Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks''

arXiv version: Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

If you find this paper helpful, please cite our work:

@article{huang2025modeling,
  title={Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks},
  author={Huang, Wei-Jin and Li, Yuan-Ming and Xia, Zhi-Wei and Tang, Yu-Ming and Lin, Kun-Yu and Hu, Jian-Fang and Zheng, Wei-Shi},
  journal={arXiv preprint arXiv:2503.22405},
  year={2025}
}

1 Prepare and Process Data

Download Datasets from official release: EgoPER, HoloAssist, CaptainCook4D

EgoPER Dataset File Structure

EgoPER/
├── coffee
│   ├── features_10fps_dinov2
│   ├── features_10fps_new
│   ├── frames_10fps_new
│   ├── test.txt
│   ├── training.txt
│   ├── trim_start_end.txt
│   ├── trim_videos
│   └── validation.txt
├── oatmeal
│   ├── features_10fps_dinov2
│   ├── features_10fps_new
│   ├── frames_10fps_new
│   ├── test.txt
│   ├── training.txt
│   ├── trim_start_end.txt
│   ├── trim_videos
│   └── validation.txt
├── ......

2 Training

A training example of tea task in EgoPER

  • Train Action Segmentation Model (ActionFormer)
python train.py ./configs/EgoPER/tea_aod_online.yaml --output af200
  • Train Error Detection Model
python train.py ./configs/EgoPER/tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200.yaml --resume ./ckpt/EgoPER/tea_aod_online_af200/epoch_205.pth.tar --output 1st 

3 Inference

An inference example of tea task in EgoPER

  • Use Action Segmentation Model to get the segmentation result
python test.py ./configs/EgoPER/tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200.yaml ./ckpt/EgoPER/tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200_1st
  • Use Error Detection Model to detect errors
python test_ed.py ./configs/EgoPER/tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200.yaml ./ckpt/EgoPER/tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200_1st
  • Calculate Test Metrics
python metric_vis_multiprocess.py --task tea --dirname tea_aod_rebuild_ca_sigt_online_clusterCenter_it0.6_addedMax_norm_unfreeze_winlen32_dila3_dilaLayer5_fr5_cu10_cus0.40_m0.9_dp0.1_e200_1st/ -as -ed 

More examples can be found in run_sh_EgoPER/run_clusterCenter_it0.6_addedMax_ca_usenorm_unfreeze_winlen32_dilation3_dilalayers5_fr5_cu10_cus0.40_m0.9_dp0.1_e200.sh

Note

The implementation of AMNAR's code is based on ActionFormer and EgoPED. We recommend reading the code of these two works to help with understanding.

About

(CVPR 2025) Official implementation of Paper ''Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks''

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published