
This repository contains the official implementation of P2AT, a novel architecture designed for real-time semantic segmentation. P2AT achieves a trade-off between accuracy and speed, establishing state-of-the-art results on Cityscapes and CamVid (pretrained on Cityscapes) without relying on inference acceleration techniques.


P2AT: Pyramid Pooling Axial Transformer for Real-time Semantic Segmentation [Arxiv] | ESWA


You need to download the Cityscapes dataset, rename its folder to cityscapes, and put it under the data folder.

Clone this repository

git clone https://github.com/mohamedac29/P2AT
cd P2AT

Datasets Preparation

1. Camvid Dataset

You can download the Camvid dataset from Kaggle

2. Cityscapes Dataset

  • Download the Cityscapes dataset, unzip it, and put the files in the data folder with the following structure.
$SEG_ROOT/data
├── Camvid
│   ├── images
│   └── labels
├── cityscapes
│   ├── gtFine
│   │   ├── test
│   │   ├── train
│   │   └── val
│   └── leftImg8bit
│       ├── test
│       ├── train
│       └── val
└── list
    ├── Camvid
    │   ├── test.lst
    │   ├── train.lst
    │   ├── trainval.lst
    │   └── val.lst
    └── cityscapes
        ├── test.lst
        ├── train.lst
        ├── trainval.lst
        └── val.lst
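
Before training, it can help to confirm that the data folder matches the layout above. The following is a minimal sketch, not part of this repository: the helper name `missing_entries` is hypothetical, and the expected paths are copied from the directory tree shown here.

```python
from pathlib import Path

# Sub-directories expected under the data folder (copied from the tree above).
EXPECTED_DIRS = [
    "Camvid/images",
    "Camvid/labels",
    "cityscapes/gtFine/test",
    "cityscapes/gtFine/train",
    "cityscapes/gtFine/val",
    "cityscapes/leftImg8bit/test",
    "cityscapes/leftImg8bit/train",
    "cityscapes/leftImg8bit/val",
]

# List files expected for each dataset under data/list.
LIST_NAMES = ["test.lst", "train.lst", "trainval.lst", "val.lst"]
EXPECTED_FILES = [
    f"list/{dataset}/{name}"
    for dataset in ("Camvid", "cityscapes")
    for name in LIST_NAMES
]


def missing_entries(data_root):
    """Return the expected directories and list files absent under data_root.

    Hypothetical helper: it only checks that the paths exist, not their contents.
    """
    root = Path(data_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling `missing_entries("data")` from the repository root returns an empty list when every directory and list file in the tree is present, and otherwise the relative paths that still need to be created or downloaded.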
   

Training

Training on Camvid dataset

  • For instance, train P2AT-S on the Camvid dataset with a total batch size of 8 on 2 GPUs (4 per GPU):
python tools/train.py --cfg configs/camvid/p2at_small_camvid.yaml GPUS "(0,1)" TRAIN.BATCH_SIZE_PER_GPU 4
  • To evaluate P2AT-S on the Camvid test set:
python tools/eval.py --cfg configs/camvid/p2at_small_camvid.yaml \
                          TEST.MODEL_FILE checkpoints/camvid/p2at_small_Camvid.pth \
                          DATASET.TEST_SET list/camvid/test.lst
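
The training command above splits the total batch size evenly across GPUs (8 samples on 2 GPUs gives `TRAIN.BATCH_SIZE_PER_GPU 4`). A minimal sketch of that arithmetic follows, assuming `tools/train.py` accepts the `GPUS` and `TRAIN.BATCH_SIZE_PER_GPU` overrides exactly as shown; the helper `build_train_command` is hypothetical and not part of this repository.

```python
def build_train_command(cfg, gpu_ids, total_batch_size):
    """Build the argument list for the training command shown above.

    Hypothetical helper: the override names (GPUS, TRAIN.BATCH_SIZE_PER_GPU)
    are taken from this README; nothing else about tools/train.py is assumed.
    """
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    if total_batch_size % len(gpu_ids) != 0:
        raise ValueError("total batch size must be divisible by the number of GPUs")

    # Each GPU processes an equal share of the total batch.
    per_gpu = total_batch_size // len(gpu_ids)

    # Format the ids the same way as in the README, e.g. "(0,1)".
    gpus = "(" + ",".join(str(g) for g in gpu_ids) + ")"

    return [
        "python", "tools/train.py",
        "--cfg", cfg,
        "GPUS", gpus,
        "TRAIN.BATCH_SIZE_PER_GPU", str(per_gpu),
    ]


cmd = build_train_command("configs/camvid/p2at_small_camvid.yaml", (0, 1), 8)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Passing the arguments as a list means the parentheses in the `GPUS` value are never interpreted by a shell.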

Citation

If you find this work useful in your research, please consider citing.

@article{elhassan2024p2at,
  title={P2AT: Pyramid pooling axial transformer for real-time semantic segmentation},
  author={Elhassan, Mohammed AM and Zhou, Changjun and Benabid, Amina and Adam, Abuzar BM},
  journal={Expert Systems with Applications},
  volume={255},
  pages={124610},
  year={2024},
  publisher={Elsevier}
}
@article{elhassan2024csnet,
  title={CSNet: Cross-Stage Subtraction Network for Real-Time Semantic Segmentation in Autonomous Driving},
  author={Elhassan, Mohammed AM and Zhou, Changjun and Zhu, Donglin and Adam, Abuzar BM and Benabid, Amina and Khan, Ali and Mehmood, Atif and Zhang, Jun and Jin, Hu and Jeon, Sang-Woon},
  journal={IEEE Transactions on Intelligent Transportation Systems},
  year={2024},
  publisher={IEEE}
}
