investigated signal. The signal features corresponding to the new coordinate system
are uncorrelated; moreover, in the case of normally distributed data, these components
are also independent. The advantage of using principal components stems from the fact
that the bands are uncorrelated, so no information contained in one band can be
predicted from knowledge of the other bands; consequently, the information carried by
each band is maximal for the whole set of bands [3].
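To illustrate the decorrelation property (an illustrative sketch, not taken from the cited work), the following Python fragment projects synthetic correlated data onto the eigenvectors of its sample covariance matrix; the covariance of the transformed features is then diagonal:

    # Sketch: projecting data onto the eigenvectors of its covariance
    # matrix yields uncorrelated components (synthetic data, for illustration).
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.multivariate_normal([0, 0], [[3.0, 1.2], [1.2, 1.0]], size=500)

    Xc = X - X.mean(axis=0)                # center the data
    C = np.cov(Xc, rowvar=False)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)   # eigen-structure of the covariance
    Y = Xc @ eigvecs                       # rotate into the principal axes

    # The off-diagonal covariance of the transformed features is
    # numerically zero, i.e. the components are uncorrelated.
    print(np.round(np.cov(Y, rowvar=False), 6))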
Principal components analysis seeks to explain the correlation structure of a set of
predictor variables using a smaller set of linear combinations of these variables. The
total variability of a data set produced by the complete set of n variables can often be
accounted for primarily by a smaller set of m linear combinations of these variables,
meaning that the m components carry almost as much information as the original n
variables. The principal components define a new coordinate system, obtained by
rotating the original one toward the directions of maximum variability [7].
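As a numerical illustration of this variance argument (using hypothetical synthetic data driven by two latent factors, not data from the paper), the following sketch shows that the leading m = 2 principal components account for almost all the variability of n = 5 correlated variables:

    # Sketch: the leading m principal components often account for most
    # of the total variability of the n original variables.
    import numpy as np

    rng = np.random.default_rng(1)
    # n = 5 correlated variables driven by 2 latent factors plus small noise
    latent = rng.normal(size=(300, 2))
    mixing = rng.normal(size=(2, 5))
    X = latent @ mixing + 0.1 * rng.normal(size=(300, 5))

    Xc = X - X.mean(axis=0)
    eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]  # descending
    explained = np.cumsum(eigvals) / eigvals.sum()
    print(explained)  # the first m = 2 components carry almost all the variance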
Classical PCA is based on the second-order statistics of the data and, in particular,
on the eigen-structure of the data covariance matrix; accordingly, PCA neural
models incorporate only cells with linear activation functions. More recently, several
generalizations of the classical PCA models to non-Gaussian models, namely
Independent Component Analysis (ICA) and Blind Source Separation (BSS)
techniques, have become a very attractive and promising framework for developing
more efficient image restoration algorithms [8].
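As an illustration of the BSS setting, the sketch below uses scikit-learn's FastICA, one standard ICA implementation (the specific algorithms surveyed in [8] may differ), to recover two non-Gaussian sources from their linear mixtures:

    # Minimal BSS sketch: recover non-Gaussian sources from linear mixtures
    # with FastICA (one standard ICA implementation, used here for illustration).
    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(2)
    t = np.linspace(0, 8, 2000)
    s1 = np.sin(2 * t)                      # non-Gaussian source 1
    s2 = np.sign(np.cos(3 * t))             # non-Gaussian source 2
    S = np.c_[s1, s2]
    A = np.array([[1.0, 0.5], [0.4, 1.0]])  # unknown mixing matrix
    X = S @ A.T                             # observed mixtures

    ica = FastICA(n_components=2, random_state=0)
    S_hat = ica.fit_transform(X)            # recovered sources (up to scale/order)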
In unsupervised classification, the classes are not known at the start of the process;
the number of classes, their defining features, and their member objects have to be
determined. Unsupervised classification can be viewed as a process of seeking valid
summaries of the data, comprising classes of similar objects such that the resulting
classes are well separated, in the sense that objects are not only similar to other
objects belonging to the same class but also significantly different from objects in
other classes. Occasionally, the summaries of a data set are expected to be relevant
for describing a large collection of objects, allowing predictions or the discovery of
hypotheses about the inner structure of the data.
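One common way to quantify this "similar within a class, different across classes" criterion, chosen here purely for illustration and not prescribed by the paper, is the silhouette coefficient:

    # Sketch: the silhouette coefficient quantifies within-class similarity
    # versus between-class separation (values near 1 indicate compact,
    # well-separated clusters).
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import silhouette_score

    X, _ = make_blobs(n_samples=300, centers=3, random_state=3)
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
    print(silhouette_score(X, labels))  # close to 1 => well-separated classes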
Since similarity plays a key role for both clustering and classification purposes, the
problem of finding relevant indicators to measure the similarity between two patterns
drawn from the same feature space has become of major importance. Recently,
alternative methods such as discriminant common vectors, neighborhood components
analysis, and Laplacianfaces have been proposed, allowing linear projection matrices
for dimensionality reduction to be learned [4], [10].
The aim of the research reported in this paper is to investigate the potential of a
principal directions-based approach in both supervised and unsupervised frameworks.
The structure of a class is represented in terms of the estimates of its principal
directions computed from data, the overall dissimilarity of a particular object with
respect to a given class being given by the "disturbance" of this structure when the
object is identified as a member of the class. In the unsupervised framework, the
clusters are computed using the estimates of the principal directions. Our approach
uses arguments based on the principal components to refine the basic idea of k-means,
aiming to assure the soundness and homogeneity of the resulting clusters. The clusters
are represented in terms of skeletons given by sets of orthogonal unit eigenvectors
(principal directions) of each cluster's sample covariance matrix. According to the
well-known result established by Karhunen and Loève, a set of principal directions
corresponds to the orthonormal basis that minimizes the mean squared error of any
truncated linear representation of the data.
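For concreteness, the following sketch gives one possible formalization of the skeleton and "disturbance" notions; the exact measure used in the paper may differ. Here the skeleton is taken as the leading eigenvectors of the cluster's sample covariance matrix, and the disturbance of an object is measured as the rotation of the skeleton's subspace when the object is adjoined to the cluster:

    # Hypothetical formalization of the skeleton/"disturbance" idea above.
    import numpy as np

    def skeleton(X, m):
        """Top-m unit eigenvectors (principal directions) of the sample covariance."""
        Xc = X - X.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
        return eigvecs[:, np.argsort(eigvals)[::-1][:m]]  # columns: directions

    def disturbance(X, x, m):
        """Frobenius distance between the projectors onto the skeleton
        before and after adjoining the object x to the cluster sample X."""
        U0 = skeleton(X, m)
        U1 = skeleton(np.vstack([X, x]), m)
        return np.linalg.norm(U0 @ U0.T - U1 @ U1.T, ord='fro')

    rng = np.random.default_rng(4)
    cluster = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.2])
    print(disturbance(cluster, np.array([0.5, 0.2, 0.1]), m=2))   # small: fits
    print(disturbance(cluster, np.array([0.0, 0.0, 12.0]), m=2))  # large: outlier

An object consistent with the cluster's structure barely perturbs the principal directions, while an atypical object rotates them substantially, which is the intuition behind using the disturbance as a dissimilarity measure.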