Abstract - As a cutting-edge deep encoder-decoder architecture, DeepLabV3+ has emerged as a state-of-the-art solution for image segmentation. It also holds great promise for semantic segmentation of aerial images captured by unmanned aerial vehicles (UAVs) in aerial and remote sensing applications, thanks to the Atrous Spatial Pyramid Pooling (ASPP) block in its encoder, whose multiple atrous convolutional layers enrich multi-scale feature extraction and learning efficiency. However, the DeepLabV3+ encoder-decoder architecture has some limitations, including a loss of information during the upsampling process and some ill-suited design choices that lead to incorrect segmentation. To address these shortcomings, we introduce an efficient architecture with a novel Feature Aggregation Network (FAN), which facilitates feature extraction across multiple scales and stages. Concurrently, we apply adaptive upgrades to the ASPP block, including a new set of dilation factors better suited to low-resolution inputs.
As a result, our improved remote sensing segmentation model achieves significant performance gains when evaluated on a real-world dataset: global precision improves by at least
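To make the ASPP customization concrete, below is a minimal MATLAB sketch of an ASPP-style block with parallel atrous convolutions, built with the Deep Learning Toolbox. The input size, the 256-filter branch width, and the dilation factors [1 2 4 8] are illustrative assumptions for low-resolution inputs, not the exact adaptive scheme proposed in the paper.

```matlab
% Minimal ASPP-style block sketch (Deep Learning Toolbox). The input size,
% branch width, and dilation factors are illustrative assumptions only.
dilations  = [1 2 4 8];
numFilters = 256;

lgraph = layerGraph(imageInputLayer([256 256 3], 'Name', 'input'));
branchNames = strings(1, numel(dilations));

for k = 1:numel(dilations)
    branch = [
        convolution2dLayer(3, numFilters, ...
            'DilationFactor', dilations(k), 'Padding', 'same', ...
            'Name', sprintf('aspp_conv_%d', k))
        batchNormalizationLayer('Name', sprintf('aspp_bn_%d', k))
        reluLayer('Name', sprintf('aspp_relu_%d', k))];
    lgraph = addLayers(lgraph, branch);
    lgraph = connectLayers(lgraph, 'input', sprintf('aspp_conv_%d', k));
    branchNames(k) = sprintf('aspp_relu_%d', k);
end

% Fuse the parallel atrous branches along the channel dimension.
lgraph = addLayers(lgraph, ...
    depthConcatenationLayer(numel(dilations), 'Name', 'aspp_concat'));
for k = 1:numel(dilations)
    lgraph = connectLayers(lgraph, branchNames(k), ...
        sprintf('aspp_concat/in%d', k));
end
% analyzeNetwork(lgraph)   % optional: visualize the assembled block
```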
We provide the source code of this work (DeepLabV3+ improvements with an adaptive dilation scheme and feature aggregation network) in MATLAB, including:
- training_test_program.m: trains the deep semantic segmentation model and tests its performance.
- lgraph_deeplabv3+_resnet50.mat: contains the architecture of the deep network with a pretrained ResNet-50 backbone. It is used in the training stage of training_test_program.m when training = 1.
- trained_deeplabv3+_resnet50.mat: contains the deep model already trained on the Vaihingen dataset with the ResNet-50 backbone. It is used only in the evaluation stage (see the evaluation sketch after this list).
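For a quick start, below is a minimal evaluation sketch. The variable name net stored inside trained_deeplabv3+_resnet50.mat and the test image file name are assumptions; check the actual contents with whos('-file', ...) and adjust accordingly.

```matlab
% Minimal evaluation sketch; assumes the .mat file stores the trained
% network in a variable named 'net' (verify with whos('-file', ...)).
data = load('trained_deeplabv3+_resnet50.mat');
net  = data.net;                          % assumed variable name

I = imread('test_image.png');             % hypothetical test image
inputSize = net.Layers(1).InputSize;      % network input resolution
I = imresize(I, inputSize(1:2));          % match the expected input size

% Pixel-wise semantic segmentation (Computer Vision Toolbox).
C = semanticseg(I, net);
B = labeloverlay(I, C);                   % visualize predicted labels
imshow(B)
```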
This source code allows you to reproduce the segmentation results on various compatible datasets. The network input is set to a resolution of
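As a hedged sketch of how such a dataset-level evaluation could look (reusing net from the snippet above), the folder paths, class names, and pixel label IDs below are placeholders for your own compatible dataset:

```matlab
% Hedged sketch for dataset-level evaluation; folder paths, class names,
% and pixel label IDs are placeholders, not the repository's actual setup.
imds = imageDatastore('dataset/images');
classes  = ["building" "vegetation" "background"];   % placeholder classes
labelIDs = [1 2 3];                                  % placeholder label IDs
pxds = pixelLabelDatastore('dataset/labels', classes, labelIDs);

% Run the trained network over the whole datastore and score the results.
pxdsPred = semanticseg(imds, net, 'WriteLocation', tempdir);
metrics  = evaluateSemanticSegmentation(pxdsPred, pxds);
disp(metrics.DataSetMetrics)              % global accuracy, mean IoU, etc.
```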
The work associated with this code is currently under revision for publication in IGRSL. If you find any errors or would like to discuss the work, please email Thien Huynh-The at thienht@hcmute.edu.vn.