2 NEURAL NETWORKS
A Neural Network (NN) is an information-
processing paradigm inspired by the way biological
nervous systems, such as the brain, process
information. A neural network is made up of a
number of artificial neurons. An artificial neuron is
simply an electronic model of a biological neuron.
How many neurons are used depends on the problem
being solved. Figure 1 shows a single neuron
in a neural network. Each neuron accepts
a weighted set of inputs and responds with an output.
[Figure 1 diagram: a single node receives inputs
with weights W1 to W4, applies a summation and
activation function, and produces an output value.]
Figure 1: A neuron in a neural network
The real power of neural networks comes when
we combine neurons in multi-layer structures.
Figure 2 represents a sample neural network. The
number of nodes in the input layer corresponds to
the number of inputs and the number of nodes in the
output layer corresponds to the number of outputs
produced by the neural network. When the network
is used, the input variable values are placed in the
input units, and then the hidden and output layer
units are progressively executed. Each of them
calculates its activation value by taking the weighted
sum of the outputs of the units in the preceding
layer, and subtracting the threshold. The activation
value is passed through the activation function to
produce the output of the neuron. When the entire
network has been executed, the outputs of the output
layer act as the output of the entire network.
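The layer-by-layer execution described above can be
sketched in a few lines of Python. This is a minimal
illustration, not the implementation used in this
paper: the logistic activation function, the function
names and the example weights and thresholds are all
assumptions chosen for the sketch.

```python
import math

def sigmoid(a):
    """Logistic activation (an assumed choice; the text names no specific function)."""
    return 1.0 / (1.0 + math.exp(-a))

def neuron_output(inputs, weights, threshold, activation=sigmoid):
    """Weighted sum of the inputs minus the threshold, passed through the activation."""
    a = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return activation(a)

def forward(inputs, layers):
    """Execute the layers in order; each layer is a list of (weights, threshold) pairs."""
    values = list(inputs)
    for layer in layers:
        # Every unit in this layer sees the outputs of the preceding layer.
        values = [neuron_output(values, w, t) for w, t in layer]
    return values

# Hypothetical network: 2 inputs, one hidden layer of 2 units, 1 output unit.
example_layers = [
    [([0.5, -0.5], 0.0), ([1.0, 1.0], 0.5)],  # hidden layer
    [([1.0, -1.0], 0.0)],                     # output layer
]
outputs = forward([1.0, 1.0], example_layers)
```

The list returned by `forward` corresponds to the
outputs of the output layer, which act as the output
of the entire network.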
Once the number of layers and the number of units
in each layer have been selected, the network's
weights and thresholds must be set so as to minimize
the prediction error made by the network. This is the
role of the training algorithms. The error of a
particular configuration of the network can be
determined by running all the training cases through
the network and comparing the actual outputs generated
with the desired (target) outputs. The differences
are combined by an error function to give
the network error.
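As an illustration of how the differences can be
combined, the sketch below uses a sum-of-squares
error function. This is one common choice and is an
assumption here, since the text does not name a
specific error function; `toy_network` and the
training cases are hypothetical stand-ins.

```python
def sum_squared_error(predict, training_cases):
    """Run every training case through the network and sum the squared differences."""
    total = 0.0
    for inputs, targets in training_cases:
        actual = predict(inputs)
        total += sum((a - t) ** 2 for a, t in zip(actual, targets))
    return total

def toy_network(inputs):
    """Hypothetical stand-in for a configured network: a single fixed linear unit."""
    return [0.5 * inputs[0] + 0.5 * inputs[1]]

# Each training case pairs an input vector with its target output vector.
cases = [([0.0, 0.0], [0.0]), ([1.0, 1.0], [1.0]), ([1.0, 0.0], [1.0])]
error = sum_squared_error(toy_network, cases)
```

A training algorithm would adjust the weights and
thresholds to drive this value down.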
[Figure 2 diagram: neurons arranged in layers
labeled INPUT, LAYER1, LAYER2 and OUTPUT.]
Figure 2: Multi-layer Neural Network
3 SUPPORT VECTOR MACHINES
The support vector machine (SVM) algorithm
(Boser et al., 1992; Vapnik, 1998) is a classification
algorithm that has received considerable attention
because of its strong performance in a wide
variety of application domains such as handwriting
recognition, object recognition, speaker
identification, face detection and text categorization
(Cristianini and Shawe-Taylor, 2000). Generally,
SVM is useful for pattern recognition in complex
datasets. It usually solves the classification problem
by learning from examples.
During the past few years, the support vector
machine learning algorithm has been broadly
applied within the area of bioinformatics. The
algorithm has been used to detect previously unknown
patterns within and among biological sequences,
which helps to classify genes and patients based on
gene expression, and has recently been applied to
several advanced biological problems. There are two
main motivations for using SVMs in
bioinformatics. First, many biological problems
involve high-dimensional, noisy data, and the
difficulty of a learning problem increases
exponentially with dimension. It has been common
practice to use dimensionality reduction to relieve this
problem. SVMs use a different technique, based on
margin maximization, to cope with high-dimensional
problems. Empirically, they have been shown to
work in high-dimensional spaces with remarkable
performance. In fact, rather than reducing
dimensionality as suggested by Duda and Hart, the
SVM increases the dimension of the feature space.
The SVM computes a simple linear classifier after
mapping the original problem into a much
higher-dimensional space using a non-linear kernel
function. In order to control overfitting in this extremely
high-dimensional space, the SVM attempts to
maximize the margin, defined as the distance
between the separating discriminant and the nearest
training point.
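The two ideas in this paragraph, mapping to a
higher-dimensional space through a kernel and
measuring the margin there, can be illustrated with a
small sketch. It assumes a degree-2 polynomial kernel
on two-dimensional inputs and a hand-picked (not
learned) separating hyperplane; the data, names and
weights are hypothetical, and no SVM training is
performed.

```python
import math

def feature_map(x):
    """Explicit feature map for the degree-2 polynomial kernel on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly_kernel(x, z):
    """k(x, z) = (x . z)^2, evaluated without leaving the input space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def margin(w, b, points, labels):
    """Signed distance from the hyperplane to the nearest mapped training point.

    A positive value means every point lies on the correct side.
    """
    norm = math.sqrt(dot(w, w))
    return min(y * (dot(w, feature_map(x)) + b) / norm
               for x, y in zip(points, labels))

# XOR-style data: not linearly separable in the original 2-D space.
points = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
labels = [1, 1, -1, -1]

# Hand-picked hyperplane in the mapped 3-D space (an assumption, not a trained SVM).
w, b = (0.0, 1.0, 0.0), 0.0
xor_margin = margin(w, b, points, labels)
```

The kernel value equals the inner product of the
mapped points, so a linear classifier in the mapped
space never has to construct that space explicitly.
An SVM would search for the `w` and `b` that make the
quantity returned by `margin` as large as possible.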
Second, in contrast to most machine learning
methods, SVMs can easily handle non-vector inputs,
COMBINING NEURAL NETWORK AND SUPPORT VECTOR MACHINE INTO INTEGRATED APPROACH FOR
BIODATA MINING