layers, faster training is achieved. The method
allows the CNN to be trained in an end-to-end manner
for the segmentation of input images of arbitrary size.
The U-Net architecture as proposed in (Olaf
Ronneberger, 2015) was previously used in biomedical
image segmentation. The newly modified U-Net, i.e.
the U-HardNet architecture presented in Section 4,
combines low-level feature maps of a satellite
image with higher-level ones, leading to precise
localization. A large number of feature channels in the
upsampling part of the U-HardNet allows context
information to be used in the higher-resolution layers.
The method is inexpensive for semantic segmentation
because it has fewer parameters, as there are no fully
connected layers, and it demonstrates the applicability
of deep learning techniques to segmentation.
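The parameter saving from omitting fully connected
layers can be illustrated with a short calculation. The
sketch below is not taken from the paper: the layer
sizes (a 3x3 convolution over 64 channels, and a dense
layer fed by a flattened 64-channel 32x32 feature map)
and both function names are hypothetical, chosen only
to show the orders of magnitude involved.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of one 2-D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + out_channels


def dense_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


# Hypothetical sizes, not values from the paper.
conv = conv2d_params(kernel_size=3, in_channels=64, out_channels=64)
dense = dense_params(in_features=64 * 32 * 32, out_features=1024)

print(conv)   # 36928
print(dense)  # 67109888
```

The convolution count does not depend on the spatial
size of the input, which is also why a fully
convolutional network can accept images of arbitrary
size.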
The paper is organized as follows. Section 2
explains multispectral images, and Section 3
describes the data set provided by DSTL. Section 4
discusses in detail the proposed method for semantic
segmentation, involving image fusion and the
Hard-Swish activation function. It also discusses the
modified U-HardNet for segmentation and its training
process. Experimental studies are discussed in
Section 5. Section 6 concludes the paper and gives
some future directions of the research.
2 MULTISPECTRAL BANDS
In satellite imagery there are two sorts of images:
• Multispectral Images: A multispectral image is
a collection of several monochrome images of the
same physical area at a defined scale but in
different spectral bands, acquired with
different sensors.
• Panchromatic Images: A panchromatic image is
rendered in black and white and is acquired over
a wide range of visible wavelengths.
The multispectral bands of the images make it
possible to extract important features that lie beyond
human vision and are used for recognition of specific
classes of objects. For instance, the near-infrared
wavelength is typically used to distinguish vegetation
types and conditions, because vegetation reflects
strongly in this range of the electromagnetic spectrum.
Besides, the color depth of the images is 11-bit and
14-bit instead of the commonly used 8-bit. From the
perspective of a neural network, a larger number
of bits is better because each pixel carries more
information, but it requires additional steps for proper
visualization.
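One such additional step is rescaling the pixel values
to the 8-bit range that displays expect. The following
is a minimal sketch, assuming a plain linear rescale
from the full 11-bit or 14-bit range; the function name
to_8bit is hypothetical, and practical pipelines often
use a percentile-based contrast stretch instead.

```python
def to_8bit(pixel_values, bit_depth):
    """Linearly rescale integer pixel values of the given bit depth to 0..255."""
    max_in = (1 << bit_depth) - 1  # 2047 for 11-bit, 16383 for 14-bit
    return [round(v * 255 / max_in) for v in pixel_values]


# Illustrative 11-bit values: darkest, mid-range and brightest pixel.
print(to_8bit([0, 1023, 2047], bit_depth=11))  # [0, 127, 255]
```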
Details of the multispectral bands that are used for
recognition of specific classes of objects in the DSTL
dataset are discussed below.
• Coastal (400-452 nm): This band detects deep
blues and violets. Its primary use is for
imaging shallow water and tracking fine particles
like dust and smoke.
• Blue (448-510 nm): This band detects ordinary
blues. It provides increased penetration of water
bodies, identifying depths of nearly 150 feet,
and is suited to separating soil and rock
surfaces from vegetation.
• Green (518-586 nm): This band detects greens
and is used for separating vegetation from soil
by detecting the green reflectance peak of leaf
surfaces. In this band, streets and highways of urban
regions appear in a brighter tone compared to the
dull tone of forest and vegetation (Mnih V.,
2010).
• Yellow (590-630 nm): This band senses in the
strong chlorophyll absorption region and in the
strong reflectance region of soils. It is used
for separating vegetation from soil. This band
highlights barren ground, urban zones, the road
layout in urban areas, and expressways.
• NIR (772-954 nm): This band measures the near
infrared. Data from this band is essential for
reflectance indices, for example the Normalized
Difference Vegetation Index (NDVI) (Jia.Y, 2014),
which allow specific characteristics of
vegetation to be measured more precisely.
• SWIR (1195-2365 nm): This range covers several
slices of the shortwave infrared. They are
especially helpful for differentiating wet earth
from dry earth.
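The NDVI mentioned for the NIR band has the standard
definition NDVI = (NIR - Red) / (NIR + Red). The
sketch below computes it per pixel. It is only an
illustration: the function name, the flat-list input
format and the sample reflectance values are
assumptions, and the red-band reflectance is taken as
a given input.

```python
def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red) for two equal-length lists."""
    values = []
    for n, r in zip(nir, red):
        total = n + r
        # Guard against division by zero where both bands are empty.
        values.append((n - r) / total if total else 0.0)
    return values


# Illustrative reflectances: vegetation-like, neutral and empty pixel.
nir_band = [0.5, 0.1, 0.0]
red_band = [0.1, 0.1, 0.0]
print(ndvi(nir_band, red_band))
```

Values close to 1 indicate strong near-infrared
reflection relative to red, as is typical of healthy
vegetation, while values near 0 or below indicate
soil, built-up surfaces or water.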
3 DATA SET DESCRIPTION
The Defence Science and Technology Laboratory
(DSTL) provides the data as both 3-band and
16-band satellite imagery of 1 km x 1 km regions.
The 3-band images are the traditional RGB natural
color images. The 16-band images contain spectral
information captured over a wider range of
wavelength channels. The multi-band imagery is
obtained in the multispectral (400 - 1040 nm) range
and the short-wave infrared (SWIR) (1195 - 2365 nm)
range.
VISAPP 2019 - 14th International Conference on Computer Vision Theory and Applications