Authors:
Kilho Shin (1); Kenta Okumoto (2); David Lowrence Shepard (3); Tetsuji Kuboyama (1); Takako Hashimoto (4) and Hiroaki Ohshima (5)
Affiliations:
(1) Gakushuin University, Tokyo, Japan; (2) Japan Post Bank, Tokyo, Japan; (3) UCLA, Scholarly Innovation Lab., CA, U.S.A.; (4) Chiba University of Commerce, Chiba, Japan; (5) University of Hyogo, Kobe, Japan
Keyword(s):
Unsupervised Learning, Feature Selection.
Abstract:
Unsupervised feature selection is difficult because many local solutions can coexist in the same dataset. No objective measure exists for judging the appropriateness of a particular local solution, because each one may reflect a meaningful but different interpretation of the dataset. Moreover, known accurate feature selection algorithms run slowly, which limits the number of local solutions that can be obtained with them and leaves only a small chance of producing a feature set that explains the phenomenon being studied. This paper presents a new method for searching many local solutions using a fast and accurate algorithm. Our feature value selection algorithm (UFVS) requires only a few tens of milliseconds for datasets with thousands of features and instances, and includes a parameter that changes which local solutions are selected. This speed changes the scale of the problem, allowing a user to try many different solutions and pick the best one. In experiments with labeled datasets, UFVS found feature value sets that explain the labels and, with different parameter values, detected relationships between feature value sets that did not line up with the given labels.