the highest average reduction rate. On the other hand, LDIS achieves lower accuracy than XGDIS. Regarding the reduction rate, notice also that XGDIS achieves higher scores than GDIS and EGDIS, the algorithms that inspired its strategy.
We also carried out experiments to evaluate the impact of the parameter k on the performance of XGDIS. Table 4 reports the accuracy achieved by the SVM classifier (with the standard parametrization of Weka 3.8) trained with the instances selected by the XGDIS algorithm in each dataset, while Table 5 reports the corresponding reduction rates. In this experiment, we considered k = 2, 5, 10 and 20, and adopted the same 10-fold cross-validation scheme.
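As an illustration of how this kind of evaluation could be reproduced, the following Python sketch measures accuracy and reduction rate for different values of k under 10-fold cross-validation. The function xgdis_select is a hypothetical placeholder for the XGDIS selection step, and scikit-learn's SVC with default parameters is used here only as a stand-in for the standard SVM parametrization of Weka 3.8; the two are not identical.

import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

def evaluate_selection(X, y, select, k_values, n_folds=10):
    # select(X, y, k) is assumed to return the indices of the selected instances.
    results = {}
    for k in k_values:
        accs, reductions = [], []
        skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=0)
        for train_idx, test_idx in skf.split(X, y):
            X_tr, y_tr = X[train_idx], y[train_idx]
            kept = select(X_tr, y_tr, k)
            reductions.append(1.0 - len(kept) / len(X_tr))
            clf = SVC()  # default parametrization, standing in for Weka's SVM defaults
            clf.fit(X_tr[kept], y_tr[kept])
            accs.append(clf.score(X[test_idx], y[test_idx]))
        results[k] = (np.mean(accs), np.mean(reductions))
    return results

# Example usage (xgdis_select is hypothetical):
# results = evaluate_selection(X, y, xgdis_select, k_values=[2, 5, 10, 20])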
Tables 4 and 5 suggest that the behavior of the XGDIS algorithm is not very sensitive to changes in the parameter k, since the accuracy and reduction values are very similar across the different values of k. Moreover, there is no clear pattern relating the parameter k to the accuracy or the reduction rate, which suggests a complex interaction between k and the intrinsic properties of each dataset.
We also compared the running times of the prototype selection algorithms considered in our experiments. In this comparison, we applied the 9 prototype selection algorithms to reduce the 3 largest datasets considered in our tests: page-blocks, optdigits and spambase. We adopted the same parametrizations used in the first experiment. We performed the experiments on an Intel® Core™ i5-5200U laptop with a 2.2 GHz CPU and 8 GB of RAM. Figure 1 shows that, for these datasets, the LDIS algorithm has the lowest running time. However, it is important to notice that the GDIS, EGDIS and XGDIS algorithms have reasonable running times when compared with the other algorithms, and that these three algorithms have very similar running times.
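The running-time measurements themselves require no special machinery; the minimal sketch below shows one way such measurements could be taken, where selector stands for any of the prototype selection functions under comparison (the name is illustrative and not part of our implementation).

import time

def time_selector(selector, X, y, repeats=3):
    # Return the best wall-clock time over a few runs to reduce measurement noise.
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        selector(X, y)
        times.append(time.perf_counter() - start)
    return min(times)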
In summary, the experiments show that XGDIS presents a good balance between reduction rate, accuracy and running time. The running time of XGDIS is lower than that of classic algorithms such as DROP3 and ICF, although it is higher than that of LDIS. However, it is important to notice that XGDIS achieves higher accuracy than LDIS in most of the datasets. Compared with GDIS and EGDIS, the XGDIS algorithm achieved similar accuracy, a higher reduction rate and a similar running time, which suggests that XGDIS can be viewed as an improvement over GDIS and EGDIS.
6 CONCLUSION
In this paper, we proposed an algorithm for instance selection, called XGDIS. It uses the notion of density to identify instances that represent a large amount of the information in the dataset. In summary, the algorithm selects the instances whose density is higher than the density of their neighbors, while removing instances that can be harmful for the classification of novel instances.
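As an illustration of this selection rule, the sketch below keeps an instance when its estimated density exceeds that of most of its same-class neighbors. Both the density estimate (the inverse of the mean distance to the k nearest neighbors of the same class) and the majority test are illustrative choices made for this sketch; they are not necessarily the exact formulation adopted by XGDIS.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def density_based_selection(X, y, k=5):
    # Keep instances whose class-wise density exceeds that of most of their neighbors.
    selected = []
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        Xc = X[idx]
        if len(Xc) <= 2:
            selected.extend(idx)  # too few instances to estimate density reliably
            continue
        nn = NearestNeighbors(n_neighbors=min(k + 1, len(Xc))).fit(Xc)
        dist, neigh = nn.kneighbors(Xc)  # column 0 is the instance itself
        density = 1.0 / (dist[:, 1:].mean(axis=1) + 1e-12)
        for i in range(len(Xc)):
            neighbors = neigh[i, 1:]
            if np.sum(density[i] > density[neighbors]) > len(neighbors) / 2:
                selected.append(idx[i])
    return np.array(selected)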
The experiments show that XGDIS presents a good balance between reduction rate, accuracy and running time. Moreover, the algorithm can be viewed as an improvement over the GDIS and EGDIS algorithms, which served as the basis for its development.
In future work, we plan to investigate how to further improve the performance of the XGDIS algorithm.