2 CLASSIFICATION OF THE IMAGES
2.1 Multilayer Feedforward Network
Firstly, the network architecture used in the experiments performed in this paper is the Multilayer Feedforward Network, henceforth called MF network. This kind of network consists of three layers of computational units: the neurons of the first layer apply the identity function, whereas the neurons of the second and third layers apply the sigmoid function. Such a network can approximate any continuous function to a specified precision (Bishop, 1995; Kuncheva, 2004).
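The forward pass of such a network can be sketched as follows; the layer widths and weight values below are illustrative assumptions, since the paper does not specify them.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mf_forward(x, W1, b1, W2, b2):
    """Three-layer MF network: the first layer applies the identity to the
    input, the second (hidden) and third (output) layers apply the sigmoid."""
    h = sigmoid(W1 @ x + b1)   # hidden layer
    y = sigmoid(W2 @ h + b2)   # output layer
    return y

# Illustrative sizes: 3 inputs, 5 hidden units, 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 3)), np.zeros(5)
W2, b2 = rng.normal(size=(2, 5)), np.zeros(2)
y = mf_forward(np.array([0.2, -0.1, 0.4]), W1, b1, W2, b2)
# y is a length-2 vector with every component in (0, 1)
```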
In the application described in this paper, the networks have been trained for a limited number of iterations. In each iteration, the weights of the network are adapted with the Backpropagation algorithm using all the patterns from the training set, T. At the end of the iteration, the Mean Square Error, MSE, is calculated by classifying all the patterns from the validation set, V. When the learning process has finished, the weights of the iteration with the lowest MSE on the validation set are assigned to the final network. The learning process is described in Algorithm 1.
Algorithm 1: MF Network Training {T, V}.
  Set initial weights randomly
  for e = 1 to epochs do
    for i = 1 to N_patterns do
      Select pattern x_i from T
      Adjust the trainable parameters
    end for
    Calculate MSE over validation set V
    Save epoch weights and calculated MSE
  end for
  Select epoch with lowest error
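The training scheme of Algorithm 1 can be sketched in Python; a one-neuron sigmoid model stands in for the full MF network, and the learning rate and epoch count are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(T_x, T_y, V_x, V_y, epochs=50, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=T_x.shape[1])            # set initial weights randomly
    best_w, best_mse = w.copy(), np.inf
    for _ in range(epochs):
        for x, t in zip(T_x, T_y):               # all patterns from T
            y = sigmoid(w @ x)
            w -= lr * (y - t) * y * (1 - y) * x  # backpropagation step
        mse = np.mean((sigmoid(V_x @ w) - V_y) ** 2)  # MSE over V
        if mse < best_mse:                       # keep weights of best epoch
            best_w, best_mse = w.copy(), mse
    return best_w, best_mse

# Toy data: 32 training patterns, 8 validation patterns.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
t = (X[:, 0] > 0).astype(float)
w, mse = train(X[:32], t[:32], X[32:], t[32:])
```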
To perform the experiments, the original dataset has been divided into three subsets. The first is the training set, T, which is used to adapt the weights of the networks (64% of the patterns). The second is the validation set, V, which is used to select the final network configuration (16% of the patterns). Finally, the test set, TS, is used to measure the accuracy of the network (20% of the patterns). The original learning set, L, refers to the union of the training and validation sets, i.e., the sets involved in the learning procedure.
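The 64/16/20 split described above can be sketched as follows; the shuffling and rounding policy are assumptions, since only the proportions are given in the text.

```python
import numpy as np

def split_dataset(patterns, seed=0):
    """Split into training (64%), validation (16%) and test (20%) sets."""
    idx = np.random.default_rng(seed).permutation(len(patterns))
    n = len(patterns)
    n_train, n_val = int(0.64 * n), int(0.16 * n)
    T = [patterns[i] for i in idx[:n_train]]                  # training set
    V = [patterns[i] for i in idx[n_train:n_train + n_val]]   # validation set
    TS = [patterns[i] for i in idx[n_train + n_val:]]         # test set
    return T, V, TS

T, V, TS = split_dataset(list(range(100)))
L = T + V   # learning set: training plus validation
```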
2.2 The Ensemble of Neural Networks
The process of designing an ensemble of neural networks consists of two main steps. In the first step, the development of the ensemble, the networks are trained according to the specifications of the ensemble method. The second step, the determination of a suitable combiner, focuses on selecting the most accurate combiner for the generated ensemble.
As previously described, the learning process of an artificial neural network is based on minimizing a target function. A simple procedure to increase the diversity of the classifier consists of using several neural networks with different initial values of the trainable parameters. Once the initial configuration is randomly set, each network can be trained as a single network. With this ensemble method, known as Simple Ensemble, the networks converge to different final configurations (Dietterich, 2000), so the diversity and performance of the system can increase. The method is described in Algorithm 2.
Algorithm 2: Simple Ensemble {T, V, N_networks}.
  Generate N_networks different seed values: seed_i
  for i = 1 to N_networks do
    Random Generator Seed = seed_i
    Original Network Training {T, V}
  end for
  Save Ensemble Configuration
Finally, the outputs of the networks are averaged in order to obtain the final output of the whole system. This way of combining an ensemble is known as Output Averaging or Ensemble Averaging.
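The Simple Ensemble with output averaging can be sketched as follows; a single-neuron sigmoid model stands in for the MF training of Algorithm 1, and the ensemble size and learning rate are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_network(X, t, seed):
    """Stand-in for MF network training; seed fixes the initial weights."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])              # random initial configuration
    for _ in range(100):
        p = sigmoid(X @ w)
        w -= 0.5 * X.T @ ((p - t) * p * (1 - p)) / len(t)
    return w

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 3))
t = (X[:, 0] > 0).astype(float)

# Simple Ensemble: same training procedure, one seed per network.
ensemble = [train_network(X, t, seed=s) for s in range(5)]   # N_networks = 5
# Ensemble Averaging: the final output is the mean of the network outputs.
outputs = np.mean([sigmoid(X @ w) for w in ensemble], axis=0)
```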
2.3 Codification of the Problem
To perform the classification task, wavelet features extracted from the image are processed by the ensemble, following (Sung et al., 2010). To determine the classification, the image is divided into N×M blocks instead of working directly with the pixels. Concretely, the two-level Daubechies wavelet transform 'Daub2' is applied to each HSI channel of the image. Then, the Mean and Energy of the wavelet sub-bands are calculated for each HSI channel and image block. With this procedure, 14 features can be extracted per HSI channel, although some of them cannot be used due to noise.
In the system, a pattern is represented by a 26-dimensional vector (8 wavelet features for each of the three HSI channels plus the 2 spatial coordinates), which is presented to the network in order to classify the original image block by block. This vector has been successfully used to classify images of orange groves. However, since the structure of orange groves is well delimited, the number of features could probably be reduced, lowering the computational cost of the classification; this aspect has not been tested yet.
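The per-block feature extraction can be sketched as follows. As an assumption to keep the sketch dependency-free, a one-level Haar decomposition stands in for the paper's two-level 'Daub2' transform (a library such as PyWavelets would supply the actual db2 transform); with four sub-bands, the Mean and Energy per sub-band yield 8 features per channel, the number used in the final 26-dimensional vector.

```python
import numpy as np

def haar2d(block):
    """One-level 2-D Haar decomposition into LL, LH, HL, HH sub-bands."""
    a, b = block[0::2, :], block[1::2, :]
    lo, hi = (a + b) / 2.0, (a - b) / 2.0          # row pass
    ll = (lo[:, 0::2] + lo[:, 1::2]) / 2.0         # column pass
    lh = (lo[:, 0::2] - lo[:, 1::2]) / 2.0
    hl = (hi[:, 0::2] + hi[:, 1::2]) / 2.0
    hh = (hi[:, 0::2] - hi[:, 1::2]) / 2.0
    return [ll, lh, hl, hh]

def block_features(block):
    """Mean and Energy of each wavelet sub-band of one channel block."""
    feats = []
    for sb in haar2d(block):
        feats.append(sb.mean())            # Mean of the sub-band
        feats.append(np.mean(sb ** 2))     # Energy of the sub-band
    return feats

# One 8x8 block of a single HSI channel (values are illustrative).
block = np.random.default_rng(3).random((8, 8))
features = block_features(block)           # 8 wavelet features per channel
```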
ICINCO 2011 - 8th International Conference on Informatics in Control, Automation and Robotics