3 EVALUATION METHODS
We use the following labels for the methods under
comparison. Our proposed image clustering method
(Rank-Order Distance based clustering using a
combination of image similarity and emotional
features) is labeled ‘RODE’. The scene change
detection technique (Meng et al., 1995) is labeled
‘SC’, and ‘SCE’ when the emotional feature is added.
For comparison, we implemented K-means clustering
(Blighe et al., 2008), which is labeled ‘KM’, and
‘KME’ for K-means clustering with the emotional
feature. Finally, the ground truth in this study is the
image clustering provided by the user, labeled ‘USER’.
To evaluate the accuracy of each method, the image
clustering results (‘ROD’, ‘KM’, ‘SC’) are compared
with the image clustering from the user (‘USER’).
Under this direct comparison, each cluster should
contain as many correct images as possible. In this
work, we follow the evaluation framework based on
similarity criteria described in (Zhu et al., 2011).
The precision ($P_s$) of a clustering algorithm can
be measured by
$$P_s = \frac{C_{correct}}{C_{total}} \qquad (6)$$
where $C_{correct}$ is the number of clustered images in the
correct group and $C_{total}$ is the number of images in the
USER cluster.
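As a minimal sketch of Eq. (6), the precision for one USER cluster could be computed as below. It assumes that the "correct group" of a USER cluster is the predicted cluster holding most of its images. This matching rule, the function name, and the data layout are illustrative assumptions and are not taken from the evaluation framework itself.

```python
from collections import Counter


def cluster_precision(user_cluster, predicted_labels):
    """Return Ps = C_correct / C_total for one USER cluster.

    user_cluster: list of image ids grouped together by the user.
    predicted_labels: dict mapping image id -> cluster label from an algorithm.
    Assumption (not from the paper): the "correct group" is the predicted
    cluster that contains the largest share of this USER cluster's images.
    """
    if not user_cluster:
        raise ValueError("USER cluster must contain at least one image")
    counts = Counter(predicted_labels[image_id] for image_id in user_cluster)
    c_correct = counts.most_common(1)[0][1]  # images in the dominant predicted cluster
    c_total = len(user_cluster)              # images in the USER cluster
    return c_correct / c_total
```

For instance, if three of the four images in a USER cluster fall into the same predicted cluster, the function returns 0.75.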
4 EXPERIMENTS AND RESULTS
4.1 Experiment Setup
In our experiment, 6 participants wore a smartphone
(an Android phone running an automatic capturing
application) and a biosensor for a period of time
(3 hours on average). Each participant was recorded
over a period of 3.5 weeks. There are 25,451 images
in 253 log events. The datasets range from daily life
activities, for example using a laptop, watching TV,
or shopping, to more special ones such as traveling
and sightseeing. A sample lifelog event set is shown
in Fig. 1. The proportions of the low, medium, and
high variance events are 41.2%, 23.2%, and 35.6%,
respectively. The lifelog image quality varies from
high to low because the images are captured
unintentionally. The implemented image clustering
and evaluation methods run in MATLAB on a PC
(E5420 2.50 GHz Xeon CPU, 4096 MB RAM, NVIDIA
Quadro FX 1700 graphics card). The processing time
for each frame and the evaluation process varies
between 10 ms and 25 ms, depending on the number of
SURF keypoints in the image stream and the number of
pairs between each cluster and the top neighbors
considered in the clustering process. The biosensors
are time-synchronized with the images from the
automatic capturing software.
In both the emotion recognition and image clustering
methods, several parameters have to be set. To
classify an event as a low, medium, or high variance
event, we consider the average normalized absolute
distance $d_a$ between consecutive frames in that
event. The criteria are as follows:
1. Low Variance Event: when $d_a > 0.7$
2. Medium Variance Event: when $0.4 < d_a < 0.7$
3. High Variance Event: when $d_a < 0.4$
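The three criteria above can be sketched as a small function. The function name is hypothetical, and the handling of the boundary values 0.4 and 0.7 (assigned here to the medium class) is an assumption, since the criteria only state strict inequalities. How the distance itself is computed is not covered here; it is taken as an input.

```python
def classify_event(d_a: float) -> str:
    """Map the average normalized absolute distance d_a to a variance class.

    Thresholds follow the listed criteria. Assumption: values exactly equal
    to 0.4 or 0.7 are treated as medium variance, which the text leaves open.
    """
    if d_a > 0.7:
        return "low"
    if d_a < 0.4:
        return "high"
    return "medium"
```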
The window applied for SVM learning and
classification is set to the interval 0-1.5 s. The
low and high levels are set to 0-2.5 µS and
4.5-10 µS, respectively. The dominant features in
the SVM are slope, peak value, and distance between
peaks. For the ROD clustering algorithm, the
Rank-Order distance threshold $T_R$, the normalized
distance threshold $T_N$, and the number of top
neighbors $K$ are set to 10, 1, and 8, respectively,
in all experiments.
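For reference, the parameter values stated above can be gathered in one place, together with a sketch of how a biosensor reading might be mapped to the low and high levels. All names are hypothetical. Treating the range boundaries as inclusive, and returning `None` for readings outside both ranges, are assumptions, since the text does not say how such values are handled.

```python
# Parameter values as stated in the text; the names are illustrative only.
ROD_PARAMS = {"T_R": 10, "T_N": 1, "K": 8}  # ROD thresholds and top neighbors
SVM_WINDOW_S = (0.0, 1.5)                    # SVM window, in seconds
LOW_LEVEL_US = (0.0, 2.5)                    # low level range, in microsiemens
HIGH_LEVEL_US = (4.5, 10.0)                  # high level range, in microsiemens


def signal_level(value_us):
    """Return "low" or "high" for a reading in microsiemens, else None.

    Assumption: range boundaries are inclusive, and readings that fall in
    neither range (for example 3.0) are left unlabeled.
    """
    if LOW_LEVEL_US[0] <= value_us <= LOW_LEVEL_US[1]:
        return "low"
    if HIGH_LEVEL_US[0] <= value_us <= HIGH_LEVEL_US[1]:
        return "high"
    return None
```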
4.2 Experiment Results
In this section, we present the results and
advantages of our proposed image clustering method.
SC and KM image clustering are implemented to
analyze the clustering performance compared to our
proposed ROD image clustering. The comparison of
image clustering using only image similarity features
is presented in Fig. 9(a), and the comparison using
the combination of image similarity features and the
emotional feature is presented in Fig. 9(b). As the
number of images increases, the precision of each
algorithm decreases when high variance events are
processed and increases when low variance events
are processed.
By integrating the emotional features from the
biosensor, all clustering techniques outperform
their results obtained with only the visual features.
Examples of images clustered by the RODE technique
are presented in Fig. 10. Our proposed RODE
clustering method achieves precision rates of 93.33%,
86.8%, and 76.5% in low, medium, and high variance
events, respectively. Other methods such as SC and
KM clustering also achieve better results, with
precision rates of 55.7% and 26.7%, respectively, in
high variance event segmentation.
The quality of the proposed image clustering result
can also be measured by the number of perfect
clusters. As presented in Table 1, our proposed RODE
method achieves the highest perfect clustering rate
(52.1%) when compared to the others (44.7% in KME and 32.6%
VISAPP 2014 - International Conference on Computer Vision Theory and Applications