The first synthetic set (hereafter referred to as
synth1) is a variation on the Trunk data set used in
(Law et al., 2004), and was designed for its 10 features
to be in decreasing order of relevance. It consists
of data sampled from two Gaussians $\mathcal{N}(\mu_1, I)$ and
$\mathcal{N}(\mu_2, I)$, where
$\mu_1 = \left(1, \frac{1}{\sqrt{3}}, \ldots, \frac{1}{\sqrt{2d-1}}, \ldots, \frac{1}{\sqrt{19}}\right)$
and $\mu_1 = -\mu_2$. We hypothesize (H1) that the feature
relevance ranking estimated by FRD-GTM for these
data will deteriorate gradually as noise is added to the
10 original features and in proportion to its level. In
order to test H1, four increasing levels of Gaussian
noise, of standard deviations 0.1, 0.2, 0.5, and 1, were
added to the 10 original features of synth1, for a given
sample size. It is also hypothesized (H2) that the fea-
ture relevance ranking will deteriorate as we add new
noisy features and in proportion to their level of noise.
In order to test H2, 5 and 10 dummy features consist-
ing of Gaussian noise of standard deviations 0.1, 0.2,
0.5, and 1, were, in turn, added to the 10 original fea-
tures.
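As an illustration of the synth1 design described above, the following is a minimal sketch using only the Python standard library. The function names, the equal mixing of the two Gaussians, the zero mean of the dummy features, and the seed are assumptions made for this example; they are not specified in the text.

```python
import math
import random


def synth1_mean(n_features=10):
    """mu_1 = (1, 1/sqrt(3), ..., 1/sqrt(2d-1), ..., 1/sqrt(19)) for d = 1..10."""
    return [1.0 / math.sqrt(2 * d - 1) for d in range(1, n_features + 1)]


def make_synth1(n_samples=1000, noise_std=0.0, n_dummy=0, dummy_std=0.1, seed=0):
    """Sample synth1-like data (hypothetical helper, not the authors' code)."""
    rng = random.Random(seed)
    mu1 = synth1_mean()
    rows = []
    for _ in range(n_samples):
        # Assumption: both Gaussians are equally likely.
        sign = 1.0 if rng.random() < 0.5 else -1.0  # mu_2 = -mu_1
        # Identity covariance: independent unit-variance noise per feature.
        row = [sign * m + rng.gauss(0.0, 1.0) for m in mu1]
        # H1 setting: extra Gaussian noise added to the 10 original features.
        if noise_std > 0:
            row = [x + rng.gauss(0.0, noise_std) for x in row]
        # H2 setting: append dummy features made of pure Gaussian noise
        # (assumed zero-mean).
        row += [rng.gauss(0.0, dummy_std) for _ in range(n_dummy)]
        rows.append(row)
    return rows


# Example: noise of std 0.5 on the original features plus 5 dummy features.
X = make_synth1(n_samples=1000, noise_std=0.5, n_dummy=5, dummy_std=0.2)
```

Because the components of µ1 shrink as 1/√(2d−1), the separation between the two Gaussians along feature d decreases with d, which is what makes the features decreasingly relevant.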
The second dataset (hereafter referred to as
synth2) consists of two features defining four neatly
separated Gaussian clusters with centres located at
(0,3),(1, 9),(6,4) and (7, 10); they are meant to be
relatively relevant in contrast to any added noise. In
a first experiment, noise of different levels was added
to the first two features, while 4 extra noise features
were added to those two. Several other experiments,
similar to the ones devised for synth1, were designed
to further test H2.
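A comparable sketch for synth2 is given below. Only the four cluster centres and the number of extra noise features come from the text; the cluster spread (`cluster_std`), the number of points per cluster, the noise levels, and all names are assumptions chosen for illustration.

```python
import random

# Cluster centres as given in the text.
CENTRES = [(0, 3), (1, 9), (6, 4), (7, 10)]


def make_synth2(n_per_cluster=250, cluster_std=0.5, noise_std=0.0,
                n_noise_features=4, noise_feature_std=1.0, seed=0):
    """Sample synth2-like data (hypothetical helper, not the authors' code)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for k, (cx, cy) in enumerate(CENTRES):
        for _ in range(n_per_cluster):
            # Two informative features around the cluster centre
            # (assumed isotropic spread).
            x = cx + rng.gauss(0.0, cluster_std)
            y = cy + rng.gauss(0.0, cluster_std)
            # Optional noise added to the first two features.
            if noise_std > 0:
                x += rng.gauss(0.0, noise_std)
                y += rng.gauss(0.0, noise_std)
            # Extra features consisting of pure Gaussian noise.
            extra = [rng.gauss(0.0, noise_feature_std)
                     for _ in range(n_noise_features)]
            rows.append([x, y] + extra)
            labels.append(k)
    return rows, labels


X2, labels2 = make_synth2(noise_std=0.2)
```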
The FRD-GTM parameters $W$ and $w_0$ were initialized
with small random values sampled from a
normal distribution. Saliencies were initialized at
$\rho_d = 0.5, \forall d, d = 1, \ldots, D$. The grid of GTM latent
centres was fixed to a square layout of 3 × 3 nodes
(i.e., 9 constrained mixture components). The corresponding
grid of basis functions $\phi_m$ was fixed to a
2 × 2 layout.
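The initialization just described could be sketched as follows. The shapes of W and w0, the scale of the random values (`init_scale`), and the placement of both grids on [-1, 1]² are assumptions made for this example; the text only states that the values are small, that saliencies start at 0.5, and that the grids are 3 × 3 and 2 × 2.

```python
import random


def square_grid(side):
    """side x side points evenly spaced on [-1, 1]^2 (assumed convention)."""
    ticks = [-1.0 + 2.0 * i / (side - 1) for i in range(side)]
    return [(u, v) for v in ticks for u in ticks]


def init_frd_gtm(n_features, grid_side=3, basis_side=2, init_scale=0.01, seed=0):
    """Hypothetical initialization sketch, not the authors' implementation."""
    rng = random.Random(seed)
    n_basis = basis_side ** 2
    # W: one row per feature, one column per basis function (assumed shape).
    W = [[rng.gauss(0.0, init_scale) for _ in range(n_basis)]
         for _ in range(n_features)]
    # w0: one value per feature (assumed shape).
    w0 = [rng.gauss(0.0, init_scale) for _ in range(n_features)]
    # Saliencies rho_d = 0.5 for every feature d = 1, ..., D.
    rho = [0.5] * n_features
    return {
        "W": W,
        "w0": w0,
        "rho": rho,
        "latent_centres": square_grid(grid_side),   # 3 x 3 = 9 nodes
        "basis_centres": square_grid(basis_side),   # 2 x 2 = 4 basis functions
    }


params = init_frd_gtm(n_features=15)
```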
4 EXPERIMENTAL RESULTS
AND DISCUSSION
The experiments outlined in the previous section aim
to assess the effect of the presence of uninformative
noise on the performance of FRD-GTM in the process
of unsupervised feature relevance estimation.
In the experiments reported in Figure 1, four increasing
levels of Gaussian noise were added
to a sample of 1,000 points of synth1. The FRD-GTM
is shown to behave robustly even in the presence of a
substantial amount of noise, although its performance
deteriorates significantly for noise of standard deviation 1,
as reflected in the breach of the expected
monotonic decrease of the mean feature saliencies.
Moreover, compared with the results in
Figure 2 (in which no noise was added to synth1), the
saliency of the most relevant feature is no longer close to 1.
H1 is, therefore, partially supported by these results.
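The notion of a breach of the expected monotonic decrease can be made concrete with a small helper such as the one below. The function name and the example saliency values are hypothetical and serve only as illustration; they are not results from the experiments.

```python
def monotonicity_breaches(mean_saliencies):
    """Return the 1-based positions d at which the mean saliency of
    feature d + 1 exceeds that of feature d."""
    return [d + 1
            for d in range(len(mean_saliencies) - 1)
            if mean_saliencies[d + 1] > mean_saliencies[d]]


# Hypothetical mean saliencies for five features, listed in the order
# of their designed (decreasing) relevance.
example = [0.95, 0.80, 0.70, 0.72, 0.50]
breaches = monotonicity_breaches(example)  # feature 4 exceeds feature 3
```

An empty list means that the estimated ranking fully agrees with the designed order of relevance.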
The FRD ranking results for the second experi-
ment, using the 10 original features of synth1 plus 5
Gaussian noise features, are shown in Figure 2. For
all levels of noise, the relevance (in the form of esti-
mated saliency) of the original features (1 → 10) is
reasonably well estimated: the saliency for the first
feature is close to 1 with almost full certainty (very
small vertical bars) and, overall, the expected mono-
tonic decrease of the mean feature saliencies is pre-
served, although breaches of such monotonicity can
also be observed. The saliencies estimated for the 5
added Gaussian noise features are consistently
small. Interestingly, the increase in the level of
noise does not seem to affect the performance of the
FRD method in any significant way: the differences
between the saliencies of the 10 original variables and
the 5 noisy ones stay roughly the same and the de-
creasing relevance for the 10 original variables does
not vary substantially. According to these results, H2
is not supported at this stage.
The FRD ranking results for the third experiment,
using the 10 original features of synth1 plus 10 Gaussian
noise features, are shown in Figure 3. Once again,
and for all levels of noise, the relevance of the 10 orig-
inal features shows, overall, the expected monotonic
decrease of the mean feature saliencies, with some
breaches of monotonicity. This time, the saliencies
estimated for the 10 added Gaussian noise features
are not that clearly small in comparison to those estimated
for the 10 original ones. In summary, the decreasing
relevance for the 10 original variables does
not vary substantially, and the differences between
the saliencies of the 10 original features and the 10
noisy ones stay roughly the same regardless of the noise
level. Nevertheless, the FRD method seems to be affected
by the increase in the number of noisy features.
According to these results, H2 is only partially sup-
ported.
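One simple way to quantify how well the original and the added noise features are separated is sketched below. The summary statistics chosen, the function name, and the example values are assumptions made for illustration; they are not the measure or the results used in the experiments.

```python
def saliency_separation(mean_saliencies, n_original=10):
    """Compare the saliencies of the original features with those of the
    added noise features (hypothetical summary, for illustration only)."""
    original = mean_saliencies[:n_original]
    noise = mean_saliencies[n_original:]
    # Difference between the average saliency of each group.
    mean_gap = sum(original) / len(original) - sum(noise) / len(noise)
    # Noise features ranked above the least salient original feature.
    n_overlapping = sum(1 for s in noise if s > min(original))
    return mean_gap, n_overlapping


# Hypothetical values: 3 original features followed by 2 noise features.
gap, overlap = saliency_separation([0.9, 0.8, 0.7, 0.2, 0.1], n_original=3)
```

A large gap with no overlapping noise features corresponds to the clear separation described for 5 added features, whereas a shrinking gap or a nonzero overlap would reflect the less clear-cut picture described for 10 added features.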
The FRD-GTM is shown to behave with reason-
able robustness when noise is added to the first two
features of synth2, as shown in Figure 4. As in the
case of synth1, its performance deteriorates signifi-
cantly for high levels of noise. Comparing these re-
sults with those in Figures 5 and 6 (in which no noise
was added to the first two features), the overall dete-
rioration becomes evident. H1 is again partially sup-
ported by these results.
The FRD ranking results for the experiments us-
ICEIS 2008 - International Conference on Enterprise Information Systems