PSPNet stands for Pyramid Scene Parsing Network.

In deep learning, the receptive field (RF) of a feature is the region of the input that influences its value. It measures how an output feature (at any layer) relates to a patch of the input.
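
As a rough illustration of how the receptive field grows with depth, the sketch below applies the standard recurrence for stacked convolutions: each layer adds `(kernel_size - 1) * dilation * jump` input pixels, where `jump` is the product of the strides so far. The layer lists are made-up examples, not the actual PSPNet backbone.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels) of one output feature.

    `layers` is a list of (kernel_size, stride, dilation) tuples, ordered
    from the input to the output.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance in input pixels between adjacent features
    for kernel_size, stride, dilation in layers:
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical stack: three 3x3 convolutions, the second with stride 2.
plain = [(3, 1, 1), (3, 2, 1), (3, 1, 1)]
# Same stack, but the last convolution uses dilation 2.
dilated = [(3, 1, 1), (3, 2, 1), (3, 1, 2)]

print(receptive_field(plain))    # 9
print(receptive_field(dilated))  # 13
```

Dilating only the last layer grows the receptive field from 9 to 13 pixels without adding parameters.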

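A Pyramid Pooling Module (PPM) is added to enlarge the receptive field. The feature map is pooled at several scales (1x1, 2x2, 3x3, and 6x6 bins in the original PSPNet design), and the pooled features are upsampled and concatenated with the original feature map.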
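
The pooling stage of such a module can be sketched in plain Python. This is only an illustration under assumptions: it works on a single-channel 2D list, uses the bin sizes from the original PSPNet design, and leaves out the 1x1 convolutions, upsampling, and concatenation that a full module would apply afterwards. The function names are hypothetical and are not taken from this project's code.

```python
import math


def adaptive_avg_pool(feature, bins):
    """Average-pool a 2D list `feature` (H x W) into a bins x bins grid."""
    height, width = len(feature), len(feature[0])
    pooled = []
    for i in range(bins):
        # Rows covered by bin i (bins may overlap when H is not divisible).
        r0 = (i * height) // bins
        r1 = math.ceil((i + 1) * height / bins)
        row = []
        for j in range(bins):
            c0 = (j * width) // bins
            c1 = math.ceil((j + 1) * width / bins)
            cells = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def pyramid_pool(feature, bin_sizes=(1, 2, 3, 6)):
    """Return one pooled grid per pyramid level, keyed by bin size."""
    return {bins: adaptive_avg_pool(feature, bins) for bins in bin_sizes}


# Toy 6x6 single-channel feature map with values 0..35.
feature = [[r * 6 + c for c in range(6)] for r in range(6)]
levels = pyramid_pool(feature)
for bins, grid in levels.items():
    print(bins, len(grid), len(grid[0]))
```

The 1x1 level summarizes the whole map in one value (global context), while the finer levels keep progressively more spatial detail.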

To reduce the effect of vanishing gradients, an additional (auxiliary) loss is computed after the res4b22 residual block.
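
A minimal sketch of how an auxiliary loss is typically combined with the main loss during training. The weight of 0.4 is the value used in the original PSPNet paper; whether this project uses the same weight is an assumption, and `total_loss` is a hypothetical helper name.

```python
def total_loss(main_loss, aux_loss, aux_weight=0.4):
    """Combine the final-layer loss with the auxiliary (deep supervision) loss.

    aux_weight=0.4 follows the original PSPNet paper; it is an assumption
    here, not a value confirmed for this project.
    """
    return main_loss + aux_weight * aux_loss


# Example with made-up loss values.
print(total_loss(main_loss=0.5, aux_loss=0.25))
```

The auxiliary branch is only needed during training; at inference time only the main output is used.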

Note: Each image and its mask must have the same filename (including the extension). In this project:
- input image size (w x h): 2048x1024
- training image size (w x h): 401x401
- number of training images: 415
- number of testing images: 20
- image source: Cityscapes

root_dir:
+ Default_Mask:
|   - null_img.png
+ Model_Weights:
|   - resnet50_v2.pth
|   - 29th_modelPSPNet.pth

dataset_dir:
+ IMG
|   - img_01.png
|   - img_02.png
|   - img_##.png
+ MASK
|   - img_01.png
|   - img_02.png
|   - img_##.png
- trainval.txt
- test.txt

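
Because images and masks share filenames, pairing them is a matter of joining the same name onto the `IMG` and `MASK` folders of the dataset layout above. The sketch below assumes that `trainval.txt` and `test.txt` list one filename (with extension) per line; the format of those files is not documented here, and `list_pairs` is a hypothetical helper.

```python
from pathlib import Path


def list_pairs(dataset_dir, split_file="trainval.txt"):
    """Return (image_path, mask_path) pairs for the names in split_file.

    Assumes one filename (with extension) per line; blank lines are skipped.
    """
    dataset_dir = Path(dataset_dir)
    pairs = []
    for line in (dataset_dir / split_file).read_text().splitlines():
        name = line.strip()
        if not name:
            continue
        image_path = dataset_dir / "IMG" / name
        mask_path = dataset_dir / "MASK" / name  # same filename as the image
        if not image_path.is_file() or not mask_path.is_file():
            raise FileNotFoundError(f"missing image or mask for {name}")
        pairs.append((image_path, mask_path))
    return pairs
```

Calling `list_pairs("dataset_dir", "test.txt")` would return the test pairs in the same way. This sketch raises an error as soon as one file of a pair is missing.
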
The model was built for 21 classes (labels), but this project uses only four.

ID | Label name | Meaning |
---|---|---|
0 | VOID | unused |
1 | DUONG_DI | road |
2 | LAN_HIEN_TAI | current lane |
3 | LAN_TRAI_0 | left lane |
4 | LAN_PHAI_0 | right lane |
5 | VOID | unused |
... | VOID | unused |
20 | VOID | unused |

- Train loss = 0.05682641424238682
- Accuracy = 0.9867874002456665
- IoU = 0.9490965843200684
- Dice = 0.9733260941505432
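
For reference, the three metrics above can be computed from pixel counts as in the sketch below. It handles a single binary (foreground/background) mask given as flat sequences of 0/1 labels. How this project averages the metrics over classes and images is not stated, so this is an illustration of the definitions rather than the project's evaluation code.

```python
def binary_metrics(pred, target):
    """Return (accuracy, IoU, Dice) for two flat sequences of 0/1 labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)

    accuracy = (tp + tn) / len(pred)
    union = tp + fp + fn
    # If neither mask has foreground pixels, treat the match as perfect.
    iou = tp / union if union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return accuracy, iou, dice


# Made-up example: 8 pixels, 2 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
target = [1, 0, 0, 0, 1, 1, 0, 0]
accuracy, iou, dice = binary_metrics(pred, target)
print(accuracy, iou, dice)
```

For a single mask, Dice and IoU are related by Dice = 2 * IoU / (1 + IoU), so Dice is never lower than IoU.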







