DeepLabV3-FAN

Abstract - As a cutting-edge deep encoder-decoder architecture, DeepLabV3+ is widely regarded as a state-of-the-art solution for image segmentation. It also has great potential for semantic segmentation of aerial images captured by unmanned aerial vehicles (UAVs) in aerial and remote sensing applications. This is thanks to the Atrous Spatial Pyramid Pooling (ASPP) block in its encoder, whose multiple atrous convolutional layers diversify feature extraction and improve learning efficiency. However, the DeepLabV3+ encoder-decoder architecture has some limitations, including the loss of information during the upsampling process and some ill-suited configuration choices that cause incorrect segmentation. To address these shortcomings, we introduce an efficient architecture with a novel Feature Aggregation Network (FAN), which facilitates the extraction of features across multiple scales and stages. In addition, we adapt the ASPP block with a new set of dilation factors that better accommodate low-resolution inputs. As a result, our improved remote sensing segmentation model achieves significant performance gains when evaluated on a real-world dataset: global precision improves by at least 5.39%, mean intersection-over-union (IoU) increases by 10.97%, and mean boundary-F1 score (BFScore) improves by 11.3%. These advances lead to a more precise identification of urban classes and thus to more accurate segmentation.
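For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal Python sketch of how overall pixel accuracy and mean IoU are conventionally derived from a confusion matrix. It is an illustration only, not the repository's MATLAB evaluation code. Reading "global precision" as overall pixel accuracy is an assumption, and the function name `segmentation_metrics` and the toy two-class matrix are hypothetical.

```python
def segmentation_metrics(confusion):
    """Return (global_accuracy, mean_iou) from a square confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i and
    whose predicted class is j. Classes with an empty union (never
    present and never predicted) are skipped when averaging the IoU.
    """
    num_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(num_classes))
    global_accuracy = correct / total

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]                                   # true positives
        fn = sum(confusion[c]) - tp                            # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # false alarms
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    mean_iou = sum(ious) / len(ious)
    return global_accuracy, mean_iou


# Toy two-class example (hypothetical numbers, not results from this work).
toy_confusion = [[50, 10],
                 [5, 35]]
accuracy, miou = segmentation_metrics(toy_confusion)
print(f"global accuracy = {accuracy:.4f}, mean IoU = {miou:.4f}")
```

The boundary-F1 score additionally requires matching predicted and ground-truth contours within a distance tolerance, so it is not covered by this sketch.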

We provide the source code of this work (DeepLabV3+ improvements with an adaptive dilation scheme and a feature aggregation network) in MATLAB, including:

- `training_test_program.m`: trains the deep semantic segmentation model and tests its performance.
- `lgraph_deeplabv3+_resnet50.mat`: contains the network architecture with a pretrained ResNet50 backbone. It is used for the training stage in `training_test_program.m` when `training = 1`.
- `trained_deeplabv3+_resnet50.mat`: contains the deep model already trained on Vaihingen with the ResNet50 backbone. It is used only for the evaluation stage.

This source code allows you to reproduce segmentation results on various compatible datasets. The network input is set to a resolution of $512 \times 512$ pixels.
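To give a sense of why the dilation factors matter at this input size, here is a small Python sketch (an illustration, not the repository's MATLAB code) that compares the spatial extent of a dilated 3 × 3 kernel with the size of the encoder feature map. The output stride of 16 is an assumption about the encoder, and the rates 6, 12, and 18 are the defaults from the original DeepLabV3+ paper for that stride, not necessarily the baseline or the new rates used in this work.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent (in feature-map pixels) of a dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


INPUT_SIZE = 512          # network input resolution stated in this README
OUTPUT_STRIDE = 16        # assumption: common DeepLabV3+ encoder setting
BASELINE_RATES = (6, 12, 18)  # assumption: defaults from the original DeepLabV3+ paper

feature_map_size = INPUT_SIZE // OUTPUT_STRIDE


def describe(rates, kernel_size=3):
    """Return (rate, extent, fits_in_feature_map) for each dilation rate."""
    rows = []
    for rate in rates:
        extent = effective_kernel_size(kernel_size, rate)
        rows.append((rate, extent, extent <= feature_map_size))
    return rows


print(f"feature map: {feature_map_size} x {feature_map_size}")
for rate, extent, fits in describe(BASELINE_RATES):
    status = "fits" if fits else "exceeds feature map"
    print(f"rate {rate:2d}: extent {extent} x {extent} ({status})")
```

Under these assumptions, the largest rate spans more pixels than the feature map itself, so many of its kernel taps fall on padding. This is one plausible reading of why smaller dilation factors can suit low-resolution inputs.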

The paper associated with this code is currently under revision for possible publication in IGRSL. If you find an error or would like to discuss the work, please email Thien Huynh-The at thienht@hcmute.edu.vn.
