Table 1: Statistics of data sets and value of the chosen parameter p.

Data set        Size of samples (N)   Dim (d)   # of classes   Dim-red (p)
COIL20                         1440      1024             20            20
ORL                             400      4096             40            50
Reuters21578                   8293     18933             65            80
TDT2                           9394     36771             30            80
with labels was selected at random to form the training set. For the ORL data, we randomly selected TN ∈ {2, 4, 6} samples per class for training. The remaining samples were used for testing.
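This per-class random split can be sketched as follows; the helper name make_split and the use of NumPy are illustrative, not the authors' code:

```python
import numpy as np

def make_split(y, tn, rng):
    """Pick tn random samples per class for training; the rest for testing."""
    train_idx, test_idx = [], []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))  # shuffle class c
        train_idx.extend(idx[:tn])                     # tn samples for training
        test_idx.extend(idx[tn:])                      # remainder for testing
    return np.array(train_idx), np.array(test_idx)

# e.g. for ORL with tn = 4 samples per class:
# rng = np.random.default_rng(0)
# train, test = make_split(y, tn=4, rng=rng)
```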
We set the regularization parameter µ = 0.5 and k = p for the fast approximate SVD, under the assumption that K − 1 ≤ k ≤ p ≪ d. Table 1 shows, for each data set, the dimension p chosen for the intermediate space. Since LDA can generate at most K − 1 discriminant directions, we finally retain K − 1 vectors of W and classify the transformed data in the new space of dimension K − 1.
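A minimal sketch of this pipeline, assuming scikit-learn's randomized_svd for the fast approximate SVD, its eigen-solver LDA with shrinkage as a stand-in for the µ-regularization of the scatter matrix, and a 1-NN classifier; none of these choices are claimed to match the authors' implementation:

```python
import numpy as np
from sklearn.utils.extmath import randomized_svd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier

def app_svd_lda_predict(X_train, y_train, X_test, k, mu=0.5):
    K = len(np.unique(y_train))                  # number of classes
    # Fast approximate SVD: keep the top-k right singular vectors.
    _, _, Vt = randomized_svd(X_train, n_components=k, random_state=0)
    Z_train, Z_test = X_train @ Vt.T, X_test @ Vt.T
    # Regularized LDA in the k-dimensional space; retain K - 1 directions.
    lda = LinearDiscriminantAnalysis(solver="eigen", shrinkage=mu,
                                     n_components=K - 1)
    W_train = lda.fit_transform(Z_train, y_train)
    W_test = lda.transform(Z_test)
    # Classify in the (K - 1)-dimensional space (1-NN is our assumption).
    clf = KNeighborsClassifier(n_neighbors=1).fit(W_train, y_train)
    return clf.predict(W_test)
```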
In order to assess the relevance of the proposed method appSVD+LDA, we have compared its performance with three other methods, which are listed below:
• Direct LDA (DLDA) (Friedman, 1989), which solves the LDA problem in the original space.
• LDA/QR (Ye and Li, 2004), which is a variant of LDA that only needs the QR decomposition of a small matrix.
• NovRP (Liu and Chen, 2009), which is an approach that uses sparse random projection as dimension reduction before performing LDA; a minimal sketch is given after this list. Its parameters µ and p have been set in the same way as for our approach.
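A hedged sketch of this baseline, assuming scikit-learn's SparseRandomProjection as the sparse random projection and its shrinkage-regularized LDA in place of the µ-regularization; both are our assumptions, not the original NovRP code:

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.random_projection import SparseRandomProjection

def nov_rp_predict(X_train, y_train, X_test, p, mu=0.5):
    # Sparse random projection to dimension p before LDA.
    rp = SparseRandomProjection(n_components=p, random_state=0)
    Z_train, Z_test = rp.fit_transform(X_train), rp.transform(X_test)
    # Regularized LDA in the projected space (shrinkage stands in for mu).
    lda = LinearDiscriminantAnalysis(solver="eigen", shrinkage=mu)
    return lda.fit(Z_train, y_train).predict(Z_test)
```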
4.3 Performance
The experimental results for all data sets highlighted above are given in Tables 2 to 9. In these tables, the results are averaged over 20 random splits for each TN (%), and we report the mean as well as the standard deviation. As the running time is nearly constant across splits, we report only its mean value.
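This reporting protocol can be sketched as follows, reusing the illustrative helpers make_split and app_svd_lda_predict introduced above (the seed and timing mechanism are assumptions):

```python
import time
import numpy as np

def evaluate(X, y, tn, k, n_splits=20):
    accs, times = [], []
    rng = np.random.default_rng(0)
    for _ in range(n_splits):
        tr, te = make_split(y, tn, rng)          # per-class random split
        t0 = time.perf_counter()
        y_pred = app_svd_lda_predict(X[tr], y[tr], X[te], k)
        times.append(time.perf_counter() - t0)
        accs.append(np.mean(y_pred == y[te]))
    # Mean and standard deviation of accuracy; mean running time.
    return np.mean(accs), np.std(accs), np.mean(times)
```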
Tables 2 and 3 show the performance results on the COIL20 data. DLDA achieves the best accuracy in this case, whereas its running time is by far the highest. appSVD+LDA achieves quite good accuracy, with a running time nearly 100 times smaller than that of DLDA. NovRP has the most efficient running time in this case, whereas its accuracy is the lowest. For the ORL data, the experimental results are displayed in Tables 4 and 5. As can be seen, appSVD+LDA achieves the best accuracy (for 4 and 6 samples per class) and a low running time. As the dimension is relatively large in this case, the computation time of DLDA is very high (see Table 5).
Reuters21578 and TDT2 are very large data sets. As DLDA needs memory to store the centered data and the scatter matrices in the original feature space, it is infeasible to apply DLDA in these cases. Tables 6 to 9 therefore display the performance results only for NovRP, LDA/QR and appSVD+LDA. The NovRP method gives the most efficient running time (see Tables 7 and 9), whereas its accuracy is by far the lowest. It can be seen that the chosen value of p is amply sufficient for appSVD+LDA to reach nearly 86% accuracy on Reuters21578 and 95% on TDT2, while its computational time remains quite small (see Tables 7 and 9). Overall, appSVD+LDA significantly outperforms DLDA in running time, and its accuracy suggests that it is both effective and efficient compared to the other methods.
4.4 Parameter Tuning
There are three essential parameters in the proposed method: µ, p and k. µ is used for the regularization of the scatter matrix. k is the dimension of the new feature space where LDA is performed. p is the dimension of the intermediate subspace onto which the original features are randomly mapped. A sensitive aspect of the proposed appSVD+LDA is the choice of p: this parameter should guarantee a minimum distortion between data points after the random mapping. In the final space, each point is represented as a k-dimensional feature vector, which leads to a faster classification process. In our experiments, we chose k = p. To illustrate the impact of this parameter, we take various values of p. The accuracy and the training time as a function of p, averaged over 20 random splits, are plotted in Figures 1 and 2. The methods DLDA and LDA/QR do not depend on p, contrary to appSVD+LDA and NovRP. In Figure 1 (right), the training time of DLDA is not plotted because it is far higher than the others. It can be seen that the accuracy of the proposed method is already good for small values of p (p = 80) and increases slowly with p.
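A hypothetical sweep over p, reusing the illustrative evaluate helper above; the p values and data loading are placeholders, not the exact grid of Figures 1 and 2:

```python
# X, y: data matrix and labels of one data set (e.g. ORL), already loaded.
for p in [20, 40, 80, 160, 320]:
    mean_acc, std_acc, mean_time = evaluate(X, y, tn=4, k=p)
    print(f"p={p}: acc={mean_acc:.3f} +/- {std_acc:.3f}, "
          f"time={mean_time:.3f}s")
```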