from the FCN(Fully Convolutional Network) as the
classifier inputs. In contrast Duan et al. (Y.Duan,
2017) proposed an approach in which CNN pooling
layer is substituted with a wavelet constrained pool-
ing layer. This layer is used in conjunction with
Markov Random Field and superpixel in order to pro-
vide a segmentation map. In (J. Geng and Chen,
2015) Geng et al. Used deep convolutional autoen-
coders (DCAE) for extracting features and automatic
classification on high resolution single polarization
TerraSAR-X images. The architectures of the DCAE
contains a convolutional hand-crafted first layer, in
which there are kernels, and a scale transformation
hand-crafted second layer, in which the correlated
neighbor pixel is integrated. The other layers are
trained with SAE (Stacked autoencoder). In fact, Xi-
aorui Ma et al (X.Ma, 2017) proposed a classifica-
tion approach based on three decisions: the first de-
cision is a local decision. A hyperspectral image will
be sampled and the test sample is based on its neigh-
borhood by calculating the Euclidean distance. The
second decision, is a global decision based on a su-
pervised classification. It calculates an Euclidean dis-
tance between the sample and the classes. The final
decision, is a self-decision. It is based on the label
class involving spectral and spatial features. The first
two decisions are applied to unlabeled samples in the
training set. After that, the deep network is trained on
the new training set to extract features and a classifi-
cation map is generated from the self-decision. Ob-
viously, the most common challenge of RS applica-
tions is RS image classification. In fact, RS images
can have similar appearance but it belongs to differ-
ent classes. Indeed, in the few recent years the DL
approaches comes as a solution to this challenge. DL
is proving that it has efficient results in hyperspectral
and multispectral BRSD imagery in land cover types
such as extracting forests, buildings, roads.
As the DL approaches are taking off big data and re-
mote sensing. In our paper, we are going to use DL
in BRSD classification by identifying and classify-
ing objects in satellite images and executing DL algo-
rithms based on two CNN models (vggnet and U-net).
From the state of the art, the vggnet architec-
ture appears as the network devoted to feature extrac-
tion tasks. This network receives an error of about
8.5% on the ImageNet Large Scale Visual Recogni-
tion Competition (ILSVRC). This is about 1% more
than the 19-layer version, but in the interest of eas-
ier handling and computation speed. The vggnet
was chosen over Alexnet and other architectures for
its simplicity, uniform 3x3 convolutions and depth,
which gives the power to exploit more general fea-
tures. Vggnet was chosen on Resnet, with about 3.5%
on the ILSVRC once again in the interest of simplic-
ity and computational flexibility. The vggnet network
was formed (by the original authors) on Imagenet’s
well-known ILSVRC-2012 dataset (J. Deng, 2010),
consisting of 1.3 million images, distributed in 1000
classes, which makes it a good features extractor.
According to the study of art, DeepUnet and U-net
model is the most suitable for the processing of mul-
tispectral satellite images so we chose to work with
U-net for the multispectral images segmentation and
classification.
We remember that our aim in this study is to char-
acterize image objects given their labels. We con-
sider this task as object extraction from satellite image
datasets.
Furthermore, we work with four databases com-
posed of heterogeneous satellite images like RGB im-
ages, multispctral images, multi-band images with
different sizes and resolutions in order to apply two
dimensions of big data such as Volume and Variety.
3 PROPOSED APPROACH
BRSD (big remote sensing data)represents a chal-
lenge for the DL. In fact, BD involves large number
of samples as inputs, large varieties of classes as out-
puts and high large dimensionality as attributes This
will lead to high running time and model complexi-
ties. For all these reasons algorithms with distributed
framework and parallelized machines are required.
From Hubel and Wiesel’s point of view exposed in
their early work on the cat’s visual cortex (D.Hubel,
1968),we approve that the visual cortex contains a
complex arrangement of cells. These cells are sen-
sitive to small local regions of the visual field, named
a receptive field. The local regions are tiled to hedge
the whole visual field area. These cells act as local
filters and are suitable to exploit the strong spatially
local correlation present in images.
Based on this assessment, we proposed our ap-
proach in three steps as following in Figure 1. First we
propose to use Resilient Distributed Datasets (RDD)
structure of Spark wherein each image is considered
as a dataset element. Hence each image is represented
by a vector of RDDs records reflecting a local region
of the image source. When the RDDs vectors are es-
tablished, they are forwarded to the deep neural net-
work inputs wherein the number of networks is equal
to each vector dimension. At this second step, we
are looking for local-object identification and image
segmentation by using a pre-trained Vggnet network.
At the third step we pipeline the Vggnet outputs to a
random forest and KNN machine learning with ML-
Deep Learning Analysis for Big Remote Sensing Image Classification
357