Non-negative Matrix Factorization for Binary Data
Jacob Søgaard Larsen and Line Katrine Harder Clemmensen
DTU Compute, Technical University of Denmark, Richard Petersens Plads, 2800, Lyngby, Denmark
Keywords:
Non-negative Matrix Factorization, Binary Data, Binary Matrix Factorization, Text Modelling.
Abstract:
We propose the Logistic Non-negative Matrix Factorization for decomposition of binary data. Binary data
are frequently generated in, e.g., text analysis, sensory data and market basket data. A common method for
analysing non-negative data is the Non-negative Matrix Factorization, but it is in theory not appropriate
for binary data; we therefore propose a novel Non-negative Matrix Factorization based on the logistic link
function. Furthermore, we generalize the method to handle missing data. The formulation of the method
is compared to a previously proposed logistic matrix factorization without a non-negativity constraint on the
features. We compare the performance of the Logistic Non-negative Matrix Factorization to Least Squares
Non-negative Matrix Factorization and Kullback-Leibler (KL) Non-negative Matrix Factorization on three sets of
binary data: a synthetic dataset, a set of student comments on their professors collected in a binary
term-document matrix, and a sensory dataset. We find that choosing the number of components is an essential part
of the modelling and interpretation, and that it is still unresolved.
1 INTRODUCTION
Non-negative matrices are found in many different
forms, from a general matrix with non-negative entries
to the case with only binary entries. The latter
is an interesting case that arises in many fields, e.g.
text data and sensory data. A common tool for pre-processing
data by unsupervised decomposition is the
Non-negative Matrix Factorization (NMF) proposed
by Lee and Seung (Lee and Seung, 1999; Lee and Seung,
2001). One issue with the general NMF is that
the resulting approximation is not bounded above, and
hence it is not suitable for the binary case. Zhang et al.
proposed the Binary Matrix Factorization, which factorizes
the binary data matrix X into two binary matrices
W and H (Zhang et al., 2010). The interpretation
of such a decomposition may be difficult, since the
method does not estimate how important an entry in
the components is, and therefore we will not consider
this method for our purpose. Gillis proposed that
when NMF is used on text data, the components are
interpreted as topics (Gillis, 2014). The model also
describes how important a topic is for each document
and how important a term is for a topic. We adapt
this approach and propose a logistic non-negative matrix
factorization. Recently, Tomé et al. proposed a
logistic but only partially non-negative matrix factorization,
where the model allows for negative feature
components (Tomé et al., 2015), whereas our method
is strictly non-negative and includes explicit modelling of the
threshold in the logistic sigmoid. Tomé et al. further
extended the model with a Lagrangian penalty on
the two-norm of the columns of W and H. Both our
method and the methods by Tomé et al. use a gradient-based
update scheme. Tomé et al. use a constant
step length and ensure non-negativity by projection,
whereas we use an adaptive step length
to ensure non-negativity. In order to evaluate how well
the model generalizes, Tomé et al. use a set-up
with a test set and a training set, while we have
generalized our method to handle missing data, thus
enabling the use of cross-validation. The methods
proposed by Tomé et al. are tested on synthetic data with
binary basis vectors and on the USPS digits. Their
emphasis is on how well the model reconstructs data,
while we focus on estimating the correct number of
feature vectors and on the interpretation of the model.
Furthermore, we test our model on sensory data and
text data.
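To make the ingredients discussed above concrete, the following is a minimal sketch of a logistic non-negative factorization fitted by gradient steps. It rests on several assumptions that are not taken from the paper: the threshold is modelled as a single scalar c, so that entry probabilities are sigmoid(WH - c); non-negativity is enforced by a constant step length followed by projection, the simpler of the two schemes mentioned above, rather than by an adaptive step length; and missing entries are handled by a binary mask M that removes them from the likelihood. The names fit_logistic_nmf, mean_nll, step and n_iter are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def mean_nll(X, M, W, H, c):
    """Mean Bernoulli negative log-likelihood over observed entries (M == 1)."""
    P = np.clip(sigmoid(W @ H - c), 1e-12, 1.0 - 1e-12)
    ll = X * np.log(P) + (1.0 - X) * np.log(1.0 - P)
    return -(M * ll).sum() / M.sum()


def fit_logistic_nmf(X, k, M=None, step=0.5, n_iter=500, seed=0):
    """Projected gradient descent for X ~ Bernoulli(sigmoid(W @ H - c)).

    Assumptions of this sketch: scalar threshold c, constant step length,
    projection onto the non-negative orthant after every step.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    M = np.ones_like(X) if M is None else np.asarray(M, dtype=float)
    n, m = X.shape
    W = rng.uniform(0.0, 1.0, size=(n, k))
    H = rng.uniform(0.0, 1.0, size=(k, m))
    c = 0.0
    n_obs = M.sum()
    for _ in range(n_iter):
        # Gradient of the mean loss w.r.t. the logits; zero where data are missing.
        G = M * (sigmoid(W @ H - c) - X) / n_obs
        grad_W, grad_H, grad_c = G @ H.T, W.T @ G, -G.sum()
        # Gradient step, then projection so that W and H stay non-negative.
        W = np.maximum(W - step * grad_W, 0.0)
        H = np.maximum(H - step * grad_H, 0.0)
        c -= step * grad_c  # the threshold is left unconstrained
    return W, H, c


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = (rng.random((30, 20)) < 0.3).astype(float)  # toy binary matrix
    W, H, c = fit_logistic_nmf(X, k=3)
    print(mean_nll(X, np.ones_like(X), W, H, c))
```

Because the sigmoid maps every entry of WH - c into (0, 1), the reconstruction is bounded above, unlike the plain product WH used in ordinary NMF.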
Training the model and selecting the model complexity
is another issue regarding NMF. Nielsen and
Mørup proposed to marginalize missing data in order
to perform cross-validation (CV) to choose the number
of components in the model (Nielsen and Mørup,
2014), and we will use this approach in this paper.
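The idea of cross-validating over matrix entries can be sketched as follows. This is an illustration under stated assumptions, not necessarily the procedure of Nielsen and Mørup: it uses ordinary least-squares NMF with masked multiplicative updates as a stand-in model, assigns individual entries to folds uniformly at random, and scores each fold by the mean squared error on its held-out entries. The names masked_ls_nmf, cv_error and n_folds are hypothetical.

```python
import numpy as np


def masked_ls_nmf(X, M, k, n_iter=200, seed=0, eps=1e-9):
    """Least-squares NMF fitted only on entries where M == 1.

    Multiplicative updates in which the mask zeroes out missing entries, so
    held-out values never influence W or H.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    M = np.asarray(M, dtype=float)
    n, m = X.shape
    W = rng.uniform(0.1, 1.0, size=(n, k))
    H = rng.uniform(0.1, 1.0, size=(k, m))
    MX = M * X
    for _ in range(n_iter):
        W *= (MX @ H.T) / ((M * (W @ H)) @ H.T + eps)
        H *= (W.T @ MX) / (W.T @ (M * (W @ H)) + eps)
    return W, H


def cv_error(X, k, n_folds=5, seed=0):
    """Entry-wise cross-validation: mean squared error on held-out entries."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    folds = rng.integers(0, n_folds, size=X.shape)  # fold label per entry
    errors = []
    for f in range(n_folds):
        test = folds == f                 # entries treated as missing
        train = (~test).astype(float)     # mask of entries used for fitting
        W, H = masked_ls_nmf(X, train, k, seed=seed)
        resid = (X - W @ H)[test]
        errors.append(np.mean(resid ** 2))
    return float(np.mean(errors))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = (rng.random((30, 20)) < 0.3).astype(float)  # toy binary matrix
    for k in range(1, 6):
        print(k, cv_error(X, k))
```

In such a set-up one would compare the held-out error across candidate values of k and prefer the smallest; whether that choice recovers the true number of components is exactly the question the paper raises.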
Larsen, J. and Clemmensen, L.
Non-negative Matrix Factorization for Binary Data.
In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - Volume 1: KDIR, pages 555-563
ISBN: 978-989-758-158-8
Copyright © 2015 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved