jinyang06/SamGOP

Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model

The PyTorch implementation of "Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model".

Environment Preparation

Create conda environment

cd SamGOP
conda create --name samgop python=3.8 -y
conda activate samgop
conda install pytorch==1.9.0 torchvision==0.10.0 cudatoolkit=11.1 -c pytorch -c nvidia
pip install -U opencv-python

# under your working directory
cd detectron2
pip install -e .

Install Requirements

cd ..
pip install -r requirements.txt
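
After the installation steps above, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch using only the standard library; the distribution names in `PACKAGES` are assumptions based on the install commands (a conda install may register a package under a different name, in which case it is reported as not installed).

```python
from importlib import metadata

# Distribution names assumed from the install commands above.
PACKAGES = ["torch", "torchvision", "opencv-python", "detectron2"]


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

The commands above pin pytorch 1.9.0 and torchvision 0.10.0, so those are the versions to look for in the output.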

CUDA kernel for MSDeformAttn

cd maskGOP/modeling/pixel_decoder/ops
sh make.sh

Data Preparation

We train our model on the GOO-Real and GOO-Synth datasets separately.

You can download GOO-Synth from OneDrive:

Train: part1, part2, part3, part4, part5, part6, part7, part8, part9, part10, part11

Test: GOOsynth-test_data

Annotation file:

GOOsynth-train_data_Annotation (Code:v4nx)

GOOsynth-test_data_Annotation (Code:ayqm)

You can download GOO-Real from OneDrive:

Train: GOOreal-train_data

Test: GOOreal-test_data

You can download the GOO-Real annotation files from Baidu disk:

GOOreal-train_data_Annotation (code:2p89)

GOOreal-val_data_Annotation (code:p9f9)

To train on GOO-Real or GOO-Synth, arrange the data as follows:

datasets
└── coco
    ├── annotations
    │   ├── cate.txt
    │   ├── train2017.json
    │   └── val2017.json
    ├── train2017
    │   ├── 0.png
    │   ├── 1.png
    │   └── ...
    └── val2017
        ├── 3609.png
        ├── 3610.png
        └── ...
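
The layout above can be checked before training with a short script. This is a sketch, not part of the repository: the function name `missing_entries` is hypothetical, and it assumes the `datasets` directory sits under the directory you pass as `root`.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the chosen root.
EXPECTED = [
    "datasets/coco/annotations/cate.txt",
    "datasets/coco/annotations/train2017.json",
    "datasets/coco/annotations/val2017.json",
    "datasets/coco/train2017",
    "datasets/coco/val2017",
]


def missing_entries(root="."):
    """Return the expected files and directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    if missing:
        print("Missing:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Dataset layout looks complete.")
```

The check only covers the presence of files and folders; it does not validate the annotation contents or count the images.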

Training & Inference

To train the model, run:

python train_net.py --num-gpus 1 --config-file ./configs/coco/instance-segmentation/maskGOP_R50_bs2_75ep_3s.yaml SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0001

To evaluate the model, run the following command, replacing weights_path with the path to your model weights:

python eavl_train_net.py --eval-only --num-gpus 1 --config-file ./configs/coco/instance-segmentation/maskGOP_R50_bs2_75ep_3s.yaml MODEL.WEIGHTS weights_path
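
Both commands follow the detectron2 convention in which the options after the config file are `KEY VALUE` pairs that override entries in the YAML config. The sketch below shows how such a command could be assembled from Python for scripted runs. The helper `build_command` is hypothetical and not part of the repository; the script names and config path are copied from the commands above.

```python
import shlex

# Config path copied from the commands above.
CONFIG = "./configs/coco/instance-segmentation/maskGOP_R50_bs2_75ep_3s.yaml"


def build_command(script, config=CONFIG, num_gpus=1, eval_only=False, overrides=None):
    """Assemble the argument list for a training or evaluation run."""
    cmd = ["python", script]
    if eval_only:
        cmd.append("--eval-only")
    cmd += ["--num-gpus", str(num_gpus), "--config-file", config]
    # Trailing KEY VALUE pairs override entries in the YAML config.
    for key, value in (overrides or {}).items():
        cmd += [key, str(value)]
    return cmd


if __name__ == "__main__":
    train = build_command(
        "train_net.py",
        overrides={"SOLVER.IMS_PER_BATCH": 2, "SOLVER.BASE_LR": 0.0001},
    )
    # Print a copy-pasteable shell command instead of launching the run.
    print(" ".join(shlex.quote(part) for part in train))
```

To launch the run instead of printing it, the list can be passed to `subprocess.run(train, check=True)`.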

Model Weights

Download model weights from Baidu disk:

GOO-Synth_re-trained_model (code:2ma2)

GOO-Real_re-trained_model (code:24zt)

Acknowledgements

Our implementation is based on detectron2 and MaskDINO.

Citation

@inproceedings{jin2024boosting,
  title={Boosting Gaze Object Prediction via Pixel-Level Supervision from Vision Foundation Model},
  author={Jin, Yang and Zhang, Lei and Yan, Shi and Fan, Bin and Wang, Binglu},
  booktitle={European Conference on Computer Vision},
  pages={369--386},
  year={2024},
  organization={Springer}
}
