Table 3: F-measure for each method. The best performance for each dataset is presented in bold. Values in parentheses are the dimension of the generated embedding space.

Method      Corel5k       Bibtex        MediaMill
OVA         0.112         0.372         –
CCA         0.150         0.404         –
CS          0.086 (50)    0.332 (50)    –
PLST        0.074 (50)    0.283 (50)    –
MME         0.178 (50)    0.403 (50)    0.199 (350)
ANMF        0.210 (30)    0.297 (140)   0.496 (350)
MNMF        0.240 (35)    0.376 (140)   0.510 (350)
OMMF        0.263 (40)    0.436 (140)   0.503 (350)
Our Method  0.283 (100)   0.422 (300)   0.540 (300)
Table 4: Convergence time for the algorithm Online Matrix Factorization for Space Embedding (OMMF) and our method, Two-Way Online Matrix Factorization (TWOMF).

Dataset     OMMF       TWOMF
Corel5k     00:02:30   00:09:29
Bibtex      06:02:00   00:16:60
MediaMill   88:37:55   01:08:11
We evaluated the performance of our method on each dataset by calculating the F-measure. Table 3 shows the results for each baseline method and the dimension of the embedding space. On the Corel5k and MediaMill datasets, our method obtained the best results in comparison with the other methods, and on Bibtex it obtained a competitive result, surpassed only by the OMMF method.
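The paper does not spell out which F-measure variant was used; the following is a minimal sketch of an example-based (per-sample) F-measure for binary multi-label annotations, where the function name and the toy data are illustrative assumptions.

```python
import numpy as np

def f_measure(y_true, y_pred):
    """Example-based F-measure for binary multi-label matrices
    (rows = samples, columns = labels)."""
    scores = []
    for t, p in zip(y_true, y_pred):
        tp = np.sum((t == 1) & (p == 1))        # true positives for this sample
        denom = t.sum() + p.sum()               # |true labels| + |predicted labels|
        scores.append(2.0 * tp / denom if denom > 0 else 1.0)
    return float(np.mean(scores))

# Toy example: two samples, three labels
y_true = np.array([[1, 0, 1], [0, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 0]])
print(f_measure(y_true, y_pred))  # mean of 2/3 and 1.0
```

Averaging the per-sample scores, rather than pooling counts over all labels, rewards methods that annotate every sample reasonably well instead of only the frequent labels.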
Table 4 shows the convergence times of the algorithm Online Matrix Factorization for Space Embedding (OMMF) and of our method on each dataset.
Comparing our algorithm against OMMF, we can see gains when dealing with larger datasets. On Corel5k, which contains only 5,000 examples, our method shows no improvement in execution time. In the case of Bibtex and MediaMill, which are larger, the improvement in execution time with our implementation is evident; our implementation uses the pylearn2 library, which makes use of the GPU.
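The memory advantage of the online formulation comes from updating the factors one mini-batch at a time. The following is a generic sketch of a single mini-batch gradient step for a factorization X ≈ WH under a squared loss; the function name, learning rate, and loss are assumptions for illustration, not the actual TWOMF update.

```python
import numpy as np

def online_mf_step(X_batch, W, H_batch, lr=1e-3):
    """One mini-batch gradient step on W for X ≈ W H.
    Only the current batch is held in memory."""
    R = X_batch - W @ H_batch      # batch residual, shape (d, b)
    grad_W = -R @ H_batch.T        # gradient of 0.5 * ||R||_F^2 w.r.t. W
    return W - lr * grad_W

rng = np.random.default_rng(0)
d, k, b = 8, 3, 4                  # feature dim, latent dim, batch size
W = rng.standard_normal((d, k))    # current factor
H = rng.standard_normal((k, b))    # batch codes, assumed fixed here
X = rng.standard_normal((d, b))    # one mini-batch of data

err_before = np.linalg.norm(X - W @ H)
W = online_mf_step(X, W, H)
err_after = np.linalg.norm(X - W @ H)
```

Because each step touches only a (d, b) slice of the data, memory grows with the batch size rather than with the full dataset, which is what makes the larger Bibtex and MediaMill collections tractable.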
5 CONCLUSIONS AND FUTURE WORK
In this paper we presented a novel multi-label annotation method that learns a mapping between the original sample representation and the labels by finding a common semantic representation. The method was compared against state-of-the-art latent space embedding methods, showing competitive results. An important characteristic of this method is that, unlike the method proposed by Otálora-Montenegro et al. (Otálora-Montenegro et al., 2013) based on OMMF, the transformation from the semantic representation to the label space is learned directly in the training phase, making the annotation process very simple: it requires only a multiplication by a transformation matrix. Finally, another important characteristic of this method is its ability to deal with large collections of data, thanks to its formulation as an online learning algorithm, achieving a significant reduction in memory requirements and computational load.
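The annotation step described above can be sketched in a few lines; the shapes, variable names, and zero threshold below are illustrative assumptions, not the paper's actual dimensions or decision rule.

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((4, 6))   # learned transformation: 4 semantic dims -> 6 labels
s = rng.standard_normal(4)        # semantic representation of a new sample

scores = s @ T                    # label scores via a single matrix multiplication
labels = scores > 0               # threshold the scores to obtain a binary annotation
```

Since the transformation is learned during training, annotating a new sample costs only this one multiplication plus a thresholding pass, with no per-sample optimization.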
A major limitation of this method, as well as of other multi-label latent space embedding methods, is that it is a linear model, which imposes significant restrictions that limit its flexibility. Therefore, as future work it would be interesting to explore non-linear alternatives, which would allow modeling more complex relationships and could improve performance on the annotation task.
ACKNOWLEDGEMENTS
This work was partially funded by the project Multimodal Image Retrieval to Support Medical Case-Based Scientific Literature Search, ID R1212LAC006, by Microsoft Research LACCIR. Jorge A. Vanegas also thanks Colciencias for doctoral grant support (617/2013).
REFERENCES

Akata, Z., Thurau, C., and Bauckhage, C. (2011). Non-negative matrix factorization in multimodality data for segmentation and label prediction. In 16th Computer Vision Winter Workshop.

Bergstra, J., Breuleux, O., Bastien, F., Lamblin, P., Pascanu, R., Desjardins, G., Turian, J., Warde-Farley, D., and Bengio, Y. (2010). Theano: a CPU and GPU math expression compiler. In Proceedings of the Python for Scientific Computing Conference (SciPy). Oral Presentation.

Caicedo, J. C., BenAbdallah, J., González, F. A., and Nasraoui, O. (2012). Multimodal representation, indexing, automated annotation and retrieval of image collections via non-negative matrix factorization. Neurocomputing, 76(1):50–60.

Cotter, A., Shamir, O., Srebro, N., and Sridharan, K. (2011). Better mini-batch algorithms via accelerated gradient methods. CoRR, abs/1106.4574.

Goodfellow, I. J., Warde-Farley, D., Lamblin, P., Dumoulin, V., Mirza, M., Pascanu, R., Bergstra, J., Bastien, F., and Bengio, Y. (2013). Pylearn2: a machine learning research library. CoRR, abs/1308.4214.
Hsu, D., Kakade, S. M., Langford, J., and Zhang, T. (2009).
ICPRAM 2015 - International Conference on Pattern Recognition Applications and Methods