As high dimensional beacons, $X_b$, are used as reference points, we do not
use any particular selection criterion to define them. Instead, we randomly
choose $J$ instances from data set $X$ to use as the beacon set $X_b$. In
order to estimate the corresponding low dimensional beacons, $Y_b$, we define
as $y_b^1$ (associated with the $x_b^1$) a zero vector, i.e.,
$y_b^1 = [0 \ldots 0] \in \mathbb{R}^n$. Then we proceed with the definition
of the rest of the beacons $y_b^j$, $j = 2, \ldots, J$, as follows: $y_b^2$ is
estimated by minimizing $\phi(y_b^2) = d_1 - \delta_1$, where $d_1$ is the
distance between $x_b^2$ and $x_b^1$, and $\delta_1$ is the distance between
the candidate $y_b^2$ and $y_b^1$. In accordance, $y_b^3$ is estimated by
minimizing $\phi(y_b^3) = \sqrt{\sum_j (d_j - \delta_j)^2}$, where $d_j$,
$j = 1, 2$, are the respective distances between $x_b^3$ and $x_b^1$, $x_b^2$,
and $\delta_j$, $j = 1, 2$, are the respective distances between the candidate
$y_b^3$ and $y_b^1$, $y_b^2$. The rest of the $Y_b$ set is estimated with the
same procedure. When the whole $Y_b$ set is defined, the rest of the $X$ data
set is mapped by minimizing Eq. 4 with respect to the beacons $Y_b$.
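The sequential beacon placement described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation used in this work: the helper names (`euclidean`, `beacon_stress`, `random_local_search`, `place_beacons`) are hypothetical, distances are assumed to be Euclidean, and a simple random local search stands in for the PSO optimizer. The sketch also applies the root-sum-of-squares objective to every beacon, which for the second beacon reduces to the absolute difference between the two distances.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def beacon_stress(candidate, placed_low, target_dists):
    """phi(y) = sqrt(sum_j (d_j - delta_j)^2) against the beacons placed so far."""
    return math.sqrt(sum((d - euclidean(candidate, y)) ** 2
                         for d, y in zip(target_dists, placed_low)))


def random_local_search(objective, dim, rng, step, iters=2000):
    """Hypothetical stand-in optimizer (NOT PSO): keep improving Gaussian moves."""
    best = [0.0] * dim
    best_val = objective(best)
    for _ in range(iters):
        cand = [b + rng.gauss(0.0, step) for b in best]
        val = objective(cand)
        if val < best_val:
            best, best_val = cand, val
        else:
            step *= 0.995  # shrink the move size after a failed attempt
    return best


def place_beacons(X, J, n=2, seed=0):
    """Pick J random beacons from X and place them one by one in R^n."""
    rng = random.Random(seed)
    Xb = rng.sample(X, J)      # random selection, no special criterion
    Yb = [[0.0] * n]           # the first low dimensional beacon is the zero vector
    for j in range(1, J):
        # High dimensional distances from beacon j to all earlier beacons.
        d = [euclidean(Xb[j], Xb[k]) for k in range(j)]
        placed = Yb[:j]
        step = max(d) or 1.0   # start with moves on the scale of the targets
        Yb.append(random_local_search(
            lambda y: beacon_stress(y, placed, d), n, rng, step))
    return Xb, Yb
```

Calling `place_beacons(X, J)` on a list of equal-length vectors returns the chosen high dimensional beacons together with their low dimensional counterparts. Each new beacon is fitted only against the beacons already placed, so the cost of placing beacon j grows with j.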
2.4 Implementation Issues
The PSO-DR algorithm was implemented in Matlab 2019a. For the PSO algorithm,
the SwarmOps (Numerical and Heuristic Optimization) toolbox for Matlab was
used (toolbox available at:
http://www.hvass-labs.org/projects/swarmops/matlab/).
We ran all the experiments on a desktop PC, with
Intel Core(TM) i5-9600K at 3.70GHz, and 16 GB of
RAM.
The parameters of the PSO algorithm were chosen according to the best
parameters list presented in (Pedersen, 2010). In particular, for the number
of particles (swarm size, $P$), the number of iterations of the PSO algorithm
(stopping criterion), $\omega$ (inertia weight), $w_p$ (particle’s-best
weight), and $w_g$ (swarm’s-best weight) we used 25, 400, 0.3925, 2.5586, and
1.3358, respectively. These values were used for all experiments conducted in
this work.
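To make the role of these parameters concrete, the following is a minimal Python sketch of a basic global-best PSO that uses the values listed above as defaults. It is not the SwarmOps code. The function name `pso`, the box bounds, the velocity clamping, and the choice to count each of the 400 iterations as one full pass over the swarm are all assumptions made for illustration.

```python
import random


def pso(objective, dim, lower, upper, swarm_size=25, iterations=400,
        omega=0.3925, w_p=2.5586, w_g=1.3358, seed=0):
    """Basic global-best PSO over the box [lower, upper]^dim (illustrative)."""
    rng = random.Random(seed)
    span = upper - lower
    pos = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(swarm_size)]
    vel = [[rng.uniform(-span, span) for _ in range(dim)]
           for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]                   # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(swarm_size), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm's best position

    for _ in range(iterations):
        for i in range(swarm_size):
            for k in range(dim):
                r_p, r_g = rng.random(), rng.random()
                # Inertia term plus pulls toward particle's and swarm's best.
                v = (omega * vel[i][k]
                     + w_p * r_p * (pbest[i][k] - pos[i][k])
                     + w_g * r_g * (gbest[k] - pos[i][k]))
                vel[i][k] = max(-span, min(span, v))  # assumed velocity clamp
                pos[i][k] = max(lower, min(upper, pos[i][k] + vel[i][k]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

For example, `pso(lambda p: sum(v * v for v in p), dim=2, lower=-5.0, upper=5.0)` searches for the minimum of a two-dimensional sphere function. In a DR setting the objective would instead measure the mismatch between high and low dimensional distances to the beacons.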
For comparison, other DR methods were also used. In particular, PCA, tSNE,
Isomap, Sammon mapping, LLE, and Laplacian Eigenmaps were compared with
PSO-DR. For all these methods, the Matlab Toolbox for Dimensionality Reduction
by Laurens van der Maaten (Van Der Maaten et al., 2009) was used. For each
method, the default parameter values provided by the toolbox were used.
The number of beacons for each data set was defined to be a quarter of the
number of instances $x_i$ in each data set $X$, i.e., $J = \frac{1}{4} M$,
except for the cases where $J$ was larger than 1000; then we set $J = 1000$
irrespective of the data set size.
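The beacon-count rule can be written as a one-line helper. The sketch below is illustrative only: the name `beacon_count` is hypothetical, and rounding down when M is not divisible by four is an assumption, since the text does not specify it.

```python
def beacon_count(num_instances, cap=1000):
    """Number of beacons J: a quarter of the M instances, capped at `cap`.

    Integer (floor) division is an assumption for M not divisible by 4.
    """
    return min(num_instances // 4, cap)
```

For the data set sizes used in this work, the rule gives J = 1000 for M = 6,000 (the cap applies), J = 360 for M = 1,440, and J = 750 for M = 3000.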
3 EXPERIMENTS
The experiments conducted in this work are presented
here. We first describe the data sets used for DR and
subsequently elaborate on the experimental setup. Fi-
nally, we present the respective results.
3.1 Data Sets
Four different data sets were used to eval-
uate the performance of the PSO-DR algo-
rithm. In particular, the MNIST data set (The
MNIST data set is publicly available from
http://yann.lecun.com/exdb/mnist/index.html), the
COIL-20 data set (Nene et al., 1996), the FMNIST
data set (Xiao et al., 2017) and the Swiss Roll data
set with M = 3000.
The MNIST data set contains 60,000 grayscale images of handwritten digits
(0, ..., 9), each of 28 × 28 pixels (i.e., N = 784). In this work we randomly
choose M = 6,000 images (600 per class) to perform the experiments. The
FMNIST data set has the same format as MNIST, except that each class
represents fashion items. Again, for FMNIST we randomly choose M = 6,000
images (600 per class). The COIL-20 data set contains 32 × 32 (i.e.,
N = 1,024) images of 20 different objects, each viewed from 72 orientations,
resulting in M = 1,440 images.
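The class-balanced random selection used for MNIST and FMNIST (600 images per class) could be sketched as below. The helper name `sample_per_class`, the fixed seed, and the use of sampling without replacement are assumptions; the text states only that the images were chosen randomly with 600 per class.

```python
import random
from collections import defaultdict


def sample_per_class(labels, per_class, seed=0):
    """Return sorted indices holding `per_class` random items of every label."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        # Sampling without replacement within each class (assumed).
        chosen.extend(rng.sample(by_class[label], per_class))
    return sorted(chosen)
```

With a label list of length 60,000 covering ten classes, `sample_per_class(labels, 600)` yields 6,000 indices, which can then be used to slice the image array.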
3.2 Experimental Setup
For the MNIST, COIL-20, and FMNIST data sets we use the PSO-DR, PCA, tSNE,
Isomap, and Sammon mapping techniques to transform the high dimensional
representations to a two-dimensional (n = 2) map. For the Swiss Roll data set
we use the PSO-DR, PCA, Laplacian Eigenmaps, Isomap, and LLE techniques to map
to the 2-D space. We substituted tSNE and Sammon mapping with Laplacian
Eigenmaps and LLE, because techniques that do not employ neighborhood graphs
perform poorly on the Swiss Roll data set (Van Der Maaten et al., 2009).
The resulting map of each DR task is shown as a scatter plot. The coloring in
the scatter plots provides a way to evaluate the performance of the DR
techniques. Moreover, for each DR method, the time needed to map the
respective data set is reported.
For the proposed PSO-DR method, as soon as the estimation of the beacons
$Y_b$ is completed, the mapping of, e.g., a high dimensional input $x_i$ to
the low dimensional space is independent of the mapping of any other input
$x_j$, $j \neq i$. Thus, it is possible to map multiple inputs
simultaneously. Hence, due to its straight-