ngxx-fus/mlaipspnet_finalproj

PSPNet for lane segmentation

PSPNet architecture

PSPNet stands for Pyramid Scene Parsing Network.

Receptive field

The receptive field (RF) of a feature is the region of the input image that influences its value. It measures how much of the input (which patch) an output feature of any layer depends on.
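As a rough illustration of how the receptive field grows with depth, the sketch below applies the standard recurrence for a chain of convolution or pooling layers. The function name `receptive_field` and the example layer lists are hypothetical and not taken from this repository; dilation and padding are ignored for brevity.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels) of the last layer.

    layers: list of (kernel_size, stride) tuples in forward order.
    Dilation is not modelled in this sketch.
    """
    rf = 1    # a single input pixel only sees itself
    jump = 1  # distance, in input pixels, between neighbouring features
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each extra kernel tap adds `jump` pixels
        jump *= stride             # striding spreads neighbouring features apart
    return rf


print(receptive_field([(3, 1), (3, 1)]))  # two stacked 3x3 convs -> 5
print(receptive_field([(7, 2), (3, 2)]))  # 7x7 stride-2 conv, 3x3 stride-2 pool -> 11
```

Stacking layers only grows the receptive field gradually, which is the motivation for adding a module that injects global context directly.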



Pyramid Pooling Module (PPM)

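The PPM is added to enlarge the effective receptive field. The feature map is average-pooled at several grid sizes; each pooled map is passed through a 1x1 convolution, upsampled back to the resolution of the feature map, and concatenated with it, so the prediction combines local and global context.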
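The multi-scale step of the PPM can be illustrated with plain-Python average pooling. This is only a sketch of the pooling stage on a single-channel map: the bin sizes (1, 2, 3, 6) are the ones used in the original PSPNet design and are assumed here, the helper `adaptive_avg_pool` is hypothetical, and the 1x1 convolution, upsampling, and concatenation steps are omitted.

```python
import math


def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2D feature map (list of rows) into a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(bins):
        # Row range covered by output cell i (floor start, ceil end).
        r0, r1 = (i * h) // bins, math.ceil((i + 1) * h / bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, math.ceil((j + 1) * w / bins)
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


# Toy 6x6 feature map holding the values 0..35.
fmap = [[float(r * 6 + c) for c in range(6)] for r in range(6)]

# One pooled map per pyramid level; coarser levels summarise larger regions.
pyramid = {b: adaptive_avg_pool(fmap, b) for b in (1, 2, 3, 6)}
print(pyramid[1])  # [[17.5]]
print(pyramid[2])  # [[7.0, 10.0], [25.0, 28.0]]
```

The 1x1 level summarises the whole map in one value (global context), while the 6x6 level here leaves the toy map unchanged (local detail).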



Auxiliary Loss

The auxiliary loss reduces the effect of vanishing gradients: an additional loss is computed on the output of the res4b22 residual block and used alongside the main loss during training.
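A minimal sketch of how a main loss and an auxiliary loss are typically combined for a single pixel is given below. The weight `aux_weight=0.4` is the value suggested in the PSPNet paper and is an assumption about this project; `cross_entropy` and `total_loss` are hypothetical helper names, not functions from this repository.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one pixel: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]


def total_loss(main_logits, aux_logits, target, aux_weight=0.4):
    """Main loss plus a down-weighted auxiliary loss (aux_weight is assumed)."""
    main = cross_entropy(main_logits, target)
    aux = cross_entropy(aux_logits, target)
    return main + aux_weight * aux


# The main head is confident in class 1; the auxiliary head is less so.
print(total_loss([0.1, 3.0, 0.2], [0.5, 1.0, 0.5], target=1))
```

The auxiliary branch only matters during training; at inference time only the main output is used.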



Dataset

Note: each image and its mask must have the same filename, including the extension. In this project:

  • input image size (w x h): 2048x1024

  • training image size (w x h): 401x401

  • number of training images: 415

  • number of testing images: 20

  • image source: Cityscapes

    root_dir:
    + Default_Mask:
    |    - null_img.png
    |
    + Model_Weights:
    |    -  resnet50_v2.pth 
    |    -  29th_modelPSPNet.pth 
    
    dataset_dir:
    + IMG
    |    -  img_01.png
    |    -  img_02.png
    |    -  img_##.png
    + MASK
    |    -  img_01.png
    |    -  img_02.png
    |    -  img_##.png
    - trainval.txt
    - test.txt
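
The layout above can be turned into (image, mask) path pairs with a short helper. This is a sketch under one assumption that the README does not state: each line of trainval.txt or test.txt holds one filename including its extension. The function name `list_pairs` is hypothetical.

```python
from pathlib import Path


def list_pairs(dataset_dir, split_file="trainval.txt"):
    """Return (image_path, mask_path) pairs for the names in a split file.

    Assumes one filename (with extension) per line in the split file.
    """
    root = Path(dataset_dir)
    pairs = []
    for line in (root / split_file).read_text().splitlines():
        name = line.strip()
        if not name:
            continue  # skip blank lines
        img = root / "IMG" / name
        mask = root / "MASK" / name  # same filename as the image
        if not (img.is_file() and mask.is_file()):
            raise FileNotFoundError(f"missing image or mask for {name}")
        pairs.append((img, mask))
    return pairs
```

For example, `list_pairs("dataset_dir", "test.txt")` would return the pairs for the test split.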
    

Labels

The model is built for 21 classes (labels), but this project uses only four of them; every other ID is VOID.

ID     LABEL NAME      MEANING
0      VOID            void
1      DUONG_DI        road
2      LAN_HIEN_TAI    current lane
3      LAN_TRAI_0      left lane
4      LAN_PHAI_0      right lane
5      VOID            void
...    VOID            void
20     VOID            void
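
One way to enforce this label scheme on a mask is to map every ID outside 1 to 4 to VOID. The sketch below works on a mask stored as a list of rows; `LABEL_NAMES`, `USED_IDS`, and `remap_mask` are hypothetical names, and whether the repository performs such a remapping is not stated in this README.

```python
VOID = 0
LABEL_NAMES = {
    0: "VOID",
    1: "DUONG_DI",
    2: "LAN_HIEN_TAI",
    3: "LAN_TRAI_0",
    4: "LAN_PHAI_0",
}
USED_IDS = {1, 2, 3, 4}


def remap_mask(mask):
    """Replace every label ID that is not in USED_IDS with VOID."""
    return [[p if p in USED_IDS else VOID for p in row] for row in mask]


mask = [[0, 1, 2],
        [3, 4, 17]]
print(remap_mask(mask))  # [[0, 1, 2], [3, 4, 0]]
```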

The result

On the training set (train_dataset), after 20 epochs:

  • Train loss = 0.05682641424238682
  • Accuracy = 0.9867874002456665
  • IoU = 0.9490965843200684
  • Dice = 0.9733260941505432
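
For reference, the sketch below shows the usual definitions of accuracy, IoU, and Dice for a single class on flattened binary masks. How this project averages the metrics over classes and images is not stated here, so the helper `binary_metrics` is a hypothetical illustration rather than the repository's evaluation code.

```python
def binary_metrics(pred, target):
    """Return (accuracy, iou, dice) for two equal-length sequences of 0/1."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    union = tp + fp + fn
    # If neither mask contains the class, treat the prediction as perfect.
    iou = tp / union if union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return accuracy, iou, dice


pred = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
print(binary_metrics(pred, target))  # accuracy 0.6, IoU 0.5, Dice 2/3
```

Dice is never lower than IoU for the same prediction, which matches the ordering of the values reported above.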

Review
















